1 Introduction



Homophily patterns in networks have important consequences. For example, citations across lit- 
eratures can affect whether, and how quickly, ideas developed in one field diffuse into another. 
Homophily also affects a variety of behaviors and opportunities, with impact on the welfare of 
individuals connected in social networks.^1 In this paper we analyze a model that provides new
insights into the patterns and emergence of homophily, and we illustrate its implications with an 
application to a network of scientific citations. 

Our main objective is to study how homophily patterns behave in an evolving network. Do 
nodes become more integrated or more segregated as they age? How does this evolution depend 
on the link formation process? In particular, does the network become more integrated if new 
connections are formed at random or if they are formed through the existing network? 

To answer these questions, we study a stochastic model of network formation in which nodes 
come in different types and types, in turn, affect the formation of links. We accomplish this by
introducing individual heterogeneity to the framework of Jackson and Rogers [19], allowing us to
focus on the issue of homophily generated through specific biases in link formation. A new node 
is born at each time period and forms links with existing nodes. The newborn node connects to 
older nodes in two ways. First, she meets nodes according to a random, but potentially type- 
biased, process. Second, the newborn node meets neighbors of the randomly met nodes ("friends 
of friends"). This is referred to as the search process and can also reflect type biases. To illustrate, 
consider citation patterns. Typically when writing a new paper, some references are known or 
found by chance by the authors while others are found because they are cited in known papers. 
Biases arise because papers may cite references with greater frequency within their own field. We 
examine the long-run properties of this model and the structure of the emerging network. 

The biases could arise from agents' preferences over the types of their neighbors and/or from 
biased meeting opportunities that agents face in connecting to each other. So, in one direction we 
enrich a growing network model by allowing for types and biases in connections, and in another 
direction we bypass explicit strategic considerations by studying a process with exogenous behav- 
ioral rules. Since in the model search goes through out-links only, strategic considerations are to 
some extent inherently limited, since a node cannot directly increase the probability of being found 
through its choice of out-neighborhood. While this may not be a good assumption in some con- 
texts, such as business partnerships or job contacts, where search presumably goes both directions 
along a link, it is appropriate in other contexts, such as scientific citations where the time order of 
publications strictly determines the direction of search. 

We wish to understand the conditions under which the network becomes increasingly "inte- 
grated" over time. We consider three different notions of integration. Under weak integration, 
nodes who are old enough are more likely to get new links than young nodes, independent of types. 

1 See [15] [17] [21] for more background and discussion. 
In this sense, age eventually overcomes any bias in link probabilities. Under long-run integration,
the distribution of types among the neighbors of a given node converges to the population distribu- 
tion as the node ages. This is a strong property that requires biases among neighbors to eventually 
disappear, despite the biased formation process. Finally, under partial integration, the type dis- 
tribution among neighbors moves monotonically towards the population distribution as nodes age, 
although it may maintain some bias in the limit. These notions capture different, but related, 
aspects of the idea of network integration. 

Our main theoretical results are as follows. First, weak integration is satisfied whenever the 
probability that a given node is found increases with that node's degree. This holds in any version 
of our model where at least some links are formed through search and there is some possibility of 
connecting across types. 

In contrast, long-run integration holds only when search is unbiased. That is, the random 
meeting process can incorporate arbitrary biases, but once the initial nodes are met, the new node 
chooses uniformly from the set of their neighbors, ignoring any further implication of their types. 
Finally, we show that under a particular condition on the biases, the process evolves monotonically 
and satisfies partial integration. In particular, the biases in nodes' links generally decrease with 
age. 

To understand where this tendency towards integration comes from, consider unbiased search. 
Observe first that as a node ages, the proportion of his links obtained through search approaches 
unity, since the number of neighbors grows with age while the probability to be found at random 
decreases with population size. Next, note that unbiased search does not imply an absence of bias 
among the neighbors of randomly met nodes. Due to homophily, randomly met nodes are relatively 
more connected with nodes of their own types. A critical fact our analysis uncovers, however, is that 
bias among neighbors' neighbors tends to be lower than among direct neighbors. This is because 
some nodes of other types are found at random and these nodes are relatively more connected to 
nodes like themselves. So the set of neighbors' neighbors has a more neutral composition than the 
neighborhoods of same-type nodes. Network-based search increases the diversity of connections 
and, conversely, nodes found through search are being found by a more diverse set of nodes. And 
since search plays a larger role with age, older nodes are less biased in their connections. 

In order to analyze network structure in more detail we consider a special, but natural, two- 
type specification of the model where random meetings are organized through a geographic or social 
space. Nodes of a given type are more likely to reside in a given location and random meetings 
take place without further bias in the various locations. In this model, biases in random meetings 
are inherently tied in a precise way to the type-distribution of the population. This feature allows 
us to obtain a number of further results. In particular, we derive an explicit formula relating a 
node's local homophily among neighbors to its age or degree. This illustrates our general results 
and further shows how partial and long-run integration are affected by changes in types' shares. 
We also study two important structural properties of the resulting network that are less tractable
in the general model: degree distributions and group-level homophily. We show how to modify 
the existing analysis of degree distributions to account for individual heterogeneity and homophily, 
obtaining new insights. 

In addition, we obtain results on group-level homophily consistent with empirical results
presented in [9] [10]. We find that relative group size has an important impact on how meeting biases
map into aggregate properties of the network. In particular, relative homophily in the network 
is strongest when groups have equal size, and vanishes as the groups take increasingly unequal 
sizes. Turning to degree distributions, we find that the majority and minority groups have different 
patterns of interactions. In particular, for the minority group, links from their own group are on 
the one hand rarer due to a size effect, but on the other hand a homophilic bias pulls in the other 
direction, creating a tension in the overall distribution of links. However, a striking result that is,
in principle, testable is that the distribution of total in-links is identical for the groups, independent
of their relative sizes.

Moving from the theoretical analysis, we illustrate the implications of the model using data on 
scientific citations in journals of the American Physical Society (APS) published between 1985 and 
2003. We find that the proportion of citations that a paper obtains from other papers in its own 
field decreases as the paper ages and becomes more cited. The observed citation patterns provide
some evidence of the partial integration property and are at least partly consistent with search
following a less biased (possibly unbiased) pattern in the citation process. In studying this application
we are motivated by two factors. First, patterns of scientific citations have important welfare 
consequences as they affect the diffusion of knowledge, with impacts on different research fields.^2
Previous research, such as [13] [26], generalizing popular concepts such as the recursive impact
factor, stresses that the importance of a citation relies on the paths that it allows in the network of
citations. We complement this argument by considering under which conditions citations are likely
to bridge scientific production across different communities.^3 Second, scientific citations possess all
the features of the network formation process that we study: nodes (papers) appear in chronological
order and never die, they link directionally to previously born nodes, they have types (scientific
classifications), and they find citations both directly and through search among the citations of other
papers.^4

2 See, for instance, [51 fH]. 

3 Ref. [32] studies cross-field citations in the scientific production of the 90's, for three different datasets. 

4 These longitudinal aspects of citation networks have motivated the use of growing network models in previous
papers including the seminal work on citation networks by Price [27] [28]. Refs. [3] [35], among others, find that
citations in PNAS over a 20-year interval show some aspects of a bias towards recently published papers, while
Refs. [25] [31], correcting for cohort size and idiosyncratic popularity, find an age effect (first mover advantage) and
a frequency distribution of in-citations that are consistent with a growing network model such as the one that we
develop here. Finally, Ref. [33] finds a positive correlation between homophily of out-citations and the number of
in-citations, but this effect holds only for a low number of in-citations.






More generally, our study contributes to a growing literature in economics and other disciplines 
studying the causes and consequences of homophily in social networks. Refs. [9] [10] study a
matching process of friendship formation. They document several empirical patterns of homophily 
and explain them through a combination of biases with respect to choice and chance. By design, 
all individuals have the same degree and age has little impact. In contrast, differences in age 
and degree are central to our analysis. Ref. [16] incorporates homophily into the random graph
model of [6] [7].^5 Again by design, homophily is not affected by degree or age in this approach.
Thus, our study and these two papers study homophily patterns in networks from complementary 
perspectives. In particular, we provide the first study of how homophily patterns change over time 
and of the relation between homophily and a node's degree. 

This study also advances the analysis of stochastic models of network formation. Earlier work 
has made great progress in explaining structural network features such as small diameter, high 
clustering and fat tails in degree distributions, see [UE1 Q33 [231 IM1 EZ] • However, most of these 
studies assume homogeneous agents and neglect homophily. With respect to this literature, we 
develop and study one of the first stochastic models of network formation incorporating individual
heterogeneity.

The rest of the paper is organized as follows. Section 2 presents the model with bias only in
the random meeting process. Section 3 includes the main result about long-run integration in this
setting. Section 4 studies the special case of two types and location-based bias. Section 5 studies
the integration properties of the model when biases appear also in the search part of the meeting
process. Section 6 contains the empirical application to citation data.

2 Homophily in a random meeting process 

In our model, nodes are born with randomly assigned types and enter sequentially, meeting existing 
nodes upon entry. Meetings result in (directed) links. Meetings take place through two distinct 
processes, which we refer to as random and search. The meeting processes depend on the types 
of the nodes involved. In this section, we study the impact of type-based biases on the random 
meeting process. 

2.1 The model 

Time is indexed by $t = 1, 2, \ldots$. In each period a new node is born. We index nodes by their birth
dates, so that node $t$ is born in period $t$.

Nodes have "types," with a generic type denoted $\theta$ belonging to a finite set $\Theta$ (with cardinality
$H$). A newborn's type is randomly drawn according to the time invariant probability distribution
$p$ (so that types are i.i.d. across time).

5 Ref. [12] uses this extension to study how homophily affects communication dynamics in networks, demonstrating
explicitly one way in which homophilic structure impacts outcomes, as does [18].

A newborn node sends $n > 1$ (directed) links to the nodes that were born in previous periods.
Of these $n$ links, a fraction $m_r$ selects nodes according to a type-dependent random process;^6 these
nodes are called "parents". The remaining fraction $m_s = 1 - m_r$ selects nodes among the neighbors
of the $n m_r$ parents that have been found via the random process; we refer to this second part of
the process as "search".^7 We define $a = m_r/m_s$ to be the ratio of the number of links formed by
the random process to the number of links formed by the search process.

Looking first at the random part of the process, we denote by $p(\theta, \theta')$ the probability that a
link sent by a node of type $\theta$ reaches a node of type $\theta'$. Among nodes of type $\theta'$, the link is formed
uniformly at random, so there is no further discrimination in this part of the process. If the random
meeting process were unbiased, the probability $p(\theta, \theta')$ would equal the share $p(\theta')$ of $\theta'$ agents in
the system. When $p(\theta, \theta') \neq p(\theta')$ we say that there is bias. This can be interpreted in different
ways. One can view the bias as a reduced form for preferences that nodes have over the type of
connections they form. The case of "homophilistic" preferences for type $\theta$ is then captured by a
situation in which $p(\theta, \theta) > p(\theta)$. The bias could also arise from constraints in the meeting process,
or from spatial differentiation, as in the location model that we will analyze in Section 4.

Turning now to the search part of the process, the way in which friends are drawn from parents' 
neighborhoods may be, in principle, either biased or unbiased. Much of the paper will study a 
process with biases only in the random part, so that in the search part, links are formed according to 
a uniform distribution on the set of parents' neighbors. This assumption has natural interpretations 
and various applications. It applies, for instance, to cases where agents face a bias in meeting 
strangers, but then get to meet the "friends" of their new friends without bias. When the original 
bias in meetings comes from biased opportunities, this seems to be a natural assumption; when 
the bias originates in preferences, it may still be the case that this bias tends to vanish when 
meetings are mediated by friends. In Section 4 we will explicitly analyze a model that relates these
biases to location-based differences in the meeting process. When search is itself biased, so that
the additional $n m_s$ nodes are found among parents' neighbors using a type-dependent probability
distribution, two types of biases are naturally defined: a bias that discriminates according to the
types of the parents through which search is made, and a bias that discriminates according to the
types of the parents' neighbors. Which type of bias is more appropriate depends on the instance
of network one has in mind, and leads to formally different models of link formation. In Section 5
we study the case of biased search and its consequences for integration.
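To make the two-step meeting process concrete, the following is a minimal simulation sketch of the case with biased random meetings and unbiased search. It is an illustration under stated assumptions, not the procedure used in the paper: the function names, the default parameter values, the seed size, and the rule of taking fewer search links when the parents have too few neighbors (instead of redrawing the parents) are all choices made here for simplicity.

```python
import random


def simulate(T=3000, n=4, m_r=0.5, p=(0.5, 0.5),
             bias=((0.8, 0.2), (0.2, 0.8)), seed=0):
    """Grow a network with type-biased random meetings and unbiased search.

    bias[i][j] plays the role of p(theta_i, theta_j). Returns (types, out_links),
    where out_links[t] lists the older nodes that node t links to.
    """
    rng = random.Random(seed)
    H = len(p)
    n_random = round(n * m_r)          # links formed at random ("parents")
    n_search = n - n_random            # links formed through search
    n_seed = H * n                     # illustrative seed size (an assumption)

    # Seed: nodes in a sequence, each linked to all predecessors, types cycling.
    types = [i % H for i in range(n_seed)]
    out_links = [list(range(i)) for i in range(n_seed)]
    by_type = [[i for i in range(n_seed) if types[i] == h] for h in range(H)]

    for t in range(n_seed, T):
        theta = rng.choices(range(H), weights=p)[0]
        parents = []
        while len(parents) < n_random:
            # Draw the target type with the biased probabilities, then a
            # uniform node of that type; redraw if it was already chosen.
            target_type = rng.choices(range(H), weights=bias[theta])[0]
            j = rng.choice(by_type[target_type])
            if j not in parents:
                parents.append(j)
        # Unbiased search: uniform draw among the parents' out-neighbors.
        # Simplification: if too few are available, take all of them.
        friends = sorted({k for j in parents for k in out_links[j]} - set(parents))
        found = rng.sample(friends, min(n_search, len(friends)))
        types.append(theta)
        out_links.append(parents + found)
        by_type[theta].append(t)
    return types, out_links


def own_type_in_share(types, out_links, nodes):
    """Fraction of the in-links received by `nodes` that come from the same type."""
    members = set(nodes)
    same = total = 0
    for t, links in enumerate(out_links):
        for j in links:
            if j in members:
                total += 1
                same += types[t] == types[j]
    return same / total if total else float("nan")


if __name__ == "__main__":
    types, out_links = simulate()
    # Compare an old cohort with a young one (cohort bounds are arbitrary).
    print(own_type_in_share(types, out_links, range(8, 108)),
          own_type_in_share(types, out_links, range(2000, 2100)))
```

Comparing the own-type share of in-links across cohorts is one simple way to look at how bias varies with age in a simulated network; any single run is noisy, so averages over several seeds would be needed before drawing conclusions.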

6 $n m_r$ is an integer in the underlying process, but is allowed to be arbitrary in the mean-field continuous-time
approximation we analyze.

7 In the underlying process, if some node is found to which the newborn is already connected, then the node is 
redrawn. If there are too few new nodes in the neighborhoods of the nodes found in the first part of the process, 
then the random nodes are redrawn. To ensure that the process is well-defined, we begin with a set of n 2 nodes in a 
sequence, each connected to all predecessors. 

8 See [9] [10] [11] for more details on other models that can justify this reduced form.
Before formally deriving the dynamics of the various processes, in the next section we propose
three notions of integration that measure the extent to which the bias in the random and/or search
process translates into biases in the long-run type-patterns of link formation.

2.2 Integration 

The definitions we provide capture different aspects of integration, focusing on how a node's type- 
pattern of connections evolves with age, and whether it gets progressively more (or less) integrated 
with the rest of the network. 

It is important to note that there are two different aspects of integration: the evolution of 
newborns' newly formed links (out-links) and the evolution of older nodes' incoming links. These 
will exhibit different dynamics. Given the bias in the random part of the network formation process, 
it is clear that there will always be some bias in the out-links of newborn nodes. The main questions 
with regard to the out-links thus pertain to how the links formed by search behave over time, and 
this is related to the question of how the in-links of older nodes behave. 

All three notions of integration discussed here pertain to the behavior of in-degrees of nodes. 
Out-degree dynamics are studied in Sections 3.3 and 4.3.

Our first notion requires, in particular, that old enough nodes are found by newborn nodes with 
higher probability than younger nodes, independently of the types of the nodes involved. 

Definition 1 The network formation process satisfies the weak integration property if for every
$t_0$, there exists $t > t_0$ such that, for all $t' > t$ and for all $\theta \in \Theta$, the node born at time $t'$ has a lower
probability than node $t_0$ to receive a link from a node of type $\theta$ born at time $t' + 1$.

Note that this form of integration requires that an old enough node of type $\theta$ ends up receiving
a link from a newborn node of type $\theta'$ with a higher probability than a young enough node of the
same type $\theta'$ as the newborn. So, even if link formation probabilities are biased in favor of similar
nodes (homophily), old enough nodes are found more often even when of a different type than the
newborn.

This form of integration is rather weak, and does not bear implications on the type-composition 
of any given node's in-degree. Our second notion of integration requires that as nodes age, their 
local neighborhood grows to represent more and more the type composition of the population. It is 
therefore a "monotonicity" property, requiring that integration, here defined in terms of how close 
the composition of neighbors' types is to what would obtain in the unbiased case, grows with age. 

Definition 2 The network formation process satisfies the partial integration property if for
every node $t_0$ the fraction of each type $\theta$ in the in-degree of $t_0$ is weakly closer to $\theta$'s population
share at time $t'$ than at time $t$, for $t' > t$, and strictly closer for some types.

So, under partial integration, the in-neighbors of an agent become more and more representative
of the overall population as time elapses.

Our final notion of integration is stronger^9 and requires that nodes eventually attract in-links
according to population shares.

Definition 3 The network formation process satisfies the long-run integration property if for
every node $t_0$ the proportion of each type $\theta$ in the in-degree of $t_0$ converges to $\theta$'s population share
as node $t_0$ ages.

In other words, in the long run any surviving difference in the proportion of links received by 
old nodes from different types is due only to the distribution of types in the population, and the 
biases in link-formation have no consequences for eventual in-degree patterns. 

3 Integration with biased random meetings and unbiased search 
3.1 Model dynamics 

A benchmark model to study long run integration properties of link formation is one where only 
the random part of the process is biased, and no further bias is present in the search part of the 
process. More precisely, the search process is unbiased in the sense that additional ties are found 
through a uniform sample among parents' neighbors, but remains indirectly biased through the 
bias that the random process has induced on the type composition of the parents' neighborhoods. 
This model allows for a clear understanding of the mechanics that lead to integration, and why and 
when integration may fail. 

We study a continuous time approximation of the model, using the techniques of mean-field 
theory. This provides approximations and limiting expressions of the process that ignore starting 
conditions and other short-term fluctuations that can be important in shaping finite versions of 
the model, and so the results must be viewed with the standard cautions that accompany such 
approximations and limit analysis. We consider the expected change in the discrete stochastic 
process as the deterministic differential of a continuous time process. 

Let us first look at the probability that node $j$ is found by newborn node $t + 1$. This depends on
the shape of the network that has formed up to time $t$. In particular, it depends on the type-profile
of in-neighbors of $j$ at time $t$, and on the bias of the newborn node towards such types. Since
search is not type-biased, each link that agent $t + 1$ forms through search is drawn from a uniform
distribution over the set of all neighbors of all parent nodes that agent $t + 1$ has found at random.

9 By stronger we do not mean that it is a necessary condition for the partial integration property defined above. 
One could think of partial integration as a criterion of monotonicity of a function in one variable, while the long-run 
integration defined below is a criterion of convergence, that could however also be non-monotonic. 
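Letting $P^{t}_j(\theta_t, \theta_j)$ denote the probability that a node born in period $j$ of type $\theta_j$ receives a link
from a node of type $\theta_t$ born at time $t > j$, the following expression is a mean-field approximation
of the overall linking probability:

$$P^{t+1}_j(\theta, \theta_j) = n m_r \frac{p(\theta)\, p(\theta, \theta_j)}{t\, p(\theta_j)} + n m_s \sum_{\theta' \in \Theta} p(\theta)\, p(\theta, \theta')\, \frac{\sum_{\lambda=j+1}^{t} P^{\lambda}_j(\theta', \theta_j)}{t\, p(\theta')}\, \frac{1}{n} \qquad (1)$$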

The first term on the right-hand side captures the probability of node $j$ being found at random.
The probability that node $t + 1$ is of type $\theta$ and links at random to a node of type $\theta_j$ is $p(\theta)\, p(\theta, \theta_j)$.
This is divided by the number of nodes of type $\theta_j$ at time $t$ which, under a mean-field approximation,
is equal to $t\, p(\theta_j)$. It is then multiplied by the number of links formed at random, $n m_r$.

The second term is the probability of node $j$ being found through search. It is given by the
number of search links ($n m_s$) formed by the node born at $t + 1$, times the sum, over all possible
types $\theta'$, of the probabilities that $j$ is found through a node of type $\theta'$. For each possible type $\theta'$,
this probability is given by the joint probability of the following events (corresponding to the four
terms in the first summation over types): (i) the newborn node is of type $\theta$; (ii) it forms a link
with a $\theta'$-type node; (iii) the $\theta'$-type node has linked to $j$ since $j$ was born;^10 (iv) among the $n$
neighbors of this $\theta'$-type node, exactly $j$ is found.

It is useful to express the terms of the above formula in a compact way. For all $\theta, \theta'$ we write

$$B_r(\theta, \theta') = p(\theta)\, \frac{p(\theta, \theta')}{p(\theta')}.$$

Note that the ratio $p(\theta, \theta')/p(\theta')$ in the above expression is a measure of the bias that type $\theta$ applies to
type $\theta'$, so that when this ratio is 1 there is no bias, while when it is greater (less) than 1 there
is a positive (negative) bias of type $\theta$ towards type $\theta'$. In the case of no bias, $B_r(\theta, \theta')$ is simply
the probability of birth of a type $\theta$ node, and $P^{t+1}_j(\theta, \theta_j)$ is $n$ times the joint probability that
the newborn node is of type $\theta$ and that node $j$ is found by drawing uniformly at random from a
population of $t$ nodes. We can decompose the matrix $B_r$ as the product of two matrices $A$ and $Q$,
where $A$ may be seen as a transition matrix of a Markov process (a Markov matrix) and $Q$ is a
diagonal matrix where the diagonal is a probability vector:

$$B_r = Q A Q^{-1}, \qquad (2)$$

with

$$A_{\theta\theta'} = p(\theta, \theta') \quad \text{and} \quad Q = \begin{pmatrix} p(1) & & 0 \\ & \ddots & \\ 0 & & p(H) \end{pmatrix}.$$
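As a quick numerical illustration of this decomposition, the sketch below builds $B_r$ both entry by entry, as $p(\theta)\, p(\theta, \theta')/p(\theta')$, and as the product $Q A Q^{-1}$, and checks that the two agree. The two-type population shares and meeting probabilities are made-up values chosen only for the example.

```python
import numpy as np

# Illustrative two-type example (values are assumptions, not from the paper).
p = np.array([0.6, 0.4])            # population shares p(theta)
A = np.array([[0.8, 0.2],           # A[theta, theta'] = p(theta, theta')
              [0.3, 0.7]])          # each row sums to one (Markov matrix)

Q = np.diag(p)
B_r = Q @ A @ np.linalg.inv(Q)      # decomposition B_r = Q A Q^{-1}

# Entry-by-entry definition: B_r(theta, theta') = p(theta) p(theta, theta') / p(theta')
B_direct = p[:, None] * A / p[None, :]

# Unbiased benchmark: p(theta, theta') = p(theta'), so B_r(theta, theta') = p(theta).
A_unbiased = np.tile(p, (len(p), 1))
B_unbiased = Q @ A_unbiased @ np.linalg.inv(Q)

print(np.allclose(B_r, B_direct))
print(B_unbiased)
```

Because $B_r$ and $A$ are similar matrices, they share the same eigenvalues, which is what lets properties of the Markov matrix $A$ carry over to the dynamics driven by $B_r$.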



10 Note that this ratio has the total (expected) number of links received by agent $j$ from $\theta'$ agents up to time $t$ as
numerator, and the total number of $\theta'$ nodes in the system at time $t$ as denominator.

11 In Appendix A we derive some general results on Markov matrices that will be useful in Appendix B where we
prove our propositions.
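Using the matrix $B_r$, and writing $P^{t}_{t_0}$ for the matrix with entries $P^{t}_{t_0}(\theta, \theta')$, equation (1) becomes:

$$P^{t+1}_{t_0} = \frac{n m_r}{t} B_r + \frac{m_s}{t} B_r \sum_{\lambda=t_0+1}^{t} P^{\lambda}_{t_0}, \qquad (3)$$

where $\sum_{\lambda=t_0+1}^{t} P^{\lambda}_{t_0}$ expresses the expected in-degree, type by type, after time $t$ for a node born at
time $t_0$.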

We define

$$\Pi^{t}_{t_0} \equiv \sum_{\lambda=t_0+1}^{t} P^{\lambda}_{t_0}.$$

With a continuous approximation:

$$\frac{d}{dt} \Pi^{t}_{t_0} = P^{t+1}_{t_0}.$$

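We study equation (3) in terms of ordinary differential equations in matrix form:

$$\frac{d}{dt} \Pi^{t}_{t_0} = \frac{n m_r}{t} B_r + \frac{m_s}{t} B_r\, \Pi^{t}_{t_0}, \qquad (4)$$

with the initial condition

$$\Pi^{t_0}_{t_0} = 0. \qquad (5)$$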

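From now on we will always assume that $B_r$ is invertible (so that the specification of types is
not redundant). With this assumption, the unique solutions to these differential equations are the
following:

$$\Pi^{t}_{t_0} = n\, \frac{m_r}{m_s} \left[ \left( \frac{t}{t_0} \right)^{m_s B_r} - I \right], \qquad (6)$$

where a constant to the power of a matrix is defined as follows:

$$x^{M} \equiv e^{M \log x} = \sum_{\mu=0}^{\infty} \frac{(\log x)^{\mu}}{\mu!}\, M^{\mu}.$$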
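The closed form can be checked numerically. The sketch below assumes the solution takes the form $\Pi^{t}_{t_0} = n (m_r/m_s) [ (t/t_0)^{m_s B_r} - I ]$ and that it should satisfy $\frac{d}{dt}\Pi^{t}_{t_0} = \frac{n m_r}{t} B_r + \frac{m_s}{t} B_r \Pi^{t}_{t_0}$ with $\Pi^{t_0}_{t_0} = 0$. The matrix power is computed through an eigendecomposition, which additionally assumes that $B_r$ is diagonalizable; the function names and the example values are illustrative.

```python
import numpy as np


def const_pow_matrix(x, M):
    """x**M = exp(M log x), computed by eigendecomposition (M assumed diagonalizable)."""
    eigvals, V = np.linalg.eig(M)
    return (V @ np.diag(x ** eigvals) @ np.linalg.inv(V)).real


def expected_in_degree(t, t0, B_r, n, m_r):
    """Closed-form candidate for Pi^t_{t0}: expected in-degree by sender type."""
    m_s = 1.0 - m_r
    identity = np.eye(B_r.shape[0])
    return n * (m_r / m_s) * (const_pow_matrix(t / t0, m_s * B_r) - identity)


# Illustrative two-type example (values are assumptions).
p = np.array([0.6, 0.4])
A = np.array([[0.8, 0.2], [0.3, 0.7]])
B_r = np.diag(p) @ A @ np.linalg.inv(np.diag(p))
n, m_r, t0, t = 4, 0.5, 10.0, 50.0

# Compare a central finite difference with the right-hand side of the ODE.
h = 1e-4
lhs = (expected_in_degree(t + h, t0, B_r, n, m_r)
       - expected_in_degree(t - h, t0, B_r, n, m_r)) / (2 * h)
rhs = (n * m_r / t) * B_r + ((1 - m_r) / t) * B_r @ expected_in_degree(t, t0, B_r, n, m_r)
print(np.max(np.abs(lhs - rhs)))
```

The printed value is the largest entrywise gap between the two sides of the differential equation; it should be at the level of finite-difference error if the closed form is right.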



3.2 Integration 

Let us test the various notions of integration on this model with unbiased search. 

It is clear that the model with $m_r = 1$ cannot satisfy weak integration. We show instead that
whenever there is some degree of search ($m_r < 1$) weak integration is satisfied. In fact, in Section
5 we strengthen this result to show that weak integration is still satisfied when search is biased as
well.

Proposition 1 If $m_r < 1$, the model with unbiased search satisfies the weak integration property.
The proof (which appears, along with the proofs of our other results, in Appendix B) shows that
the weak integration property is not specific to the unbiased search model. Indeed, various models
in which the in-degree of a node determines the probability of being found by a newborn node in a
sufficiently increasing manner would give the same result. Moreover, search is not needed for this
type of dependence to take place. Another model with "type-biased" preferential attachment in
which the probability of receiving a link is positively correlated with a node's in-degree, and which
exhibits the same weak integration property, is discussed in the conclusion of [11].

Partial and long-run integration are, again, not satisfied when $m_r = 1$. The next propositions
show that, otherwise, the long-run integration property is always satisfied by the model, while the
partial integration property needs an additional assumption.

Proposition 2 If $m_r < 1$, the model with unbiased search satisfies the long-run integration
property.

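Partial integration, instead, occurs under an additional condition. Consider a Markov matrix
$M$. As formally stated in Appendix A, writing $M^{\infty} = \lim_{\mu \to \infty} M^{\mu}$, we say that $M$ satisfies the
monotone convergence property if, for every pair $i, j \in \{1, \ldots, H\}$, and for every $\mu \in \mathbb{N}$, the
element $M^{\mu}_{ij}$ satisfies:

1. if $M_{ij} > M^{\infty}_{ij}$, then $M_{ij} \geq M^{\mu}_{ij} \geq M^{\mu+1}_{ij} \geq M^{\infty}_{ij}$;

2. if $M_{ij} < M^{\infty}_{ij}$, then $M_{ij} \leq M^{\mu}_{ij} \leq M^{\mu+1}_{ij} \leq M^{\infty}_{ij}$.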

The monotone convergence property captures the idea that transition probabilities are monotone
over time. Even with a strictly positive transition matrix, this condition does impose additional
restrictions.^12 It is beyond the scope of this paper to find general or even necessary conditions for
monotone convergence of Markov matrices.
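For a given Markov matrix, the property can at least be checked numerically. The sketch below is a hypothetical helper, not part of the paper: it reads the condition as requiring each entry of the successive powers $M^{\mu}$ to move weakly monotonically from $M_{ij}$ towards its limit without overshooting, approximates the limit by a very high power, and tests only a finite horizon, so a `True` result is evidence rather than proof.

```python
import numpy as np


def satisfies_monotone_convergence(M, horizon=200, tol=1e-10):
    """Numerically test monotone convergence of the powers of a Markov matrix M."""
    M = np.asarray(M, dtype=float)
    M_inf = np.linalg.matrix_power(M, 2 ** 20)   # numerical proxy for the limit
    above = M > M_inf + tol                      # entries starting above their limit
    below = M < M_inf - tol                      # entries starting below their limit
    current = M.copy()
    for _ in range(horizon):
        nxt = current @ M
        # Entries that start above the limit must fall weakly and stay above it.
        if np.any(above & ((nxt > current + tol) | (nxt < M_inf - tol))):
            return False
        # Entries that start below the limit must rise weakly and stay below it.
        if np.any(below & ((nxt < current - tol) | (nxt > M_inf + tol))):
            return False
        current = nxt
    return True


# A homophilous two-type matrix (illustrative values).
homophilous = [[0.8, 0.2], [0.3, 0.7]]
# A nearly cyclic three-state matrix in the spirit of the example in footnote 12.
cyclic = [[0.05, 0.90, 0.05],
          [0.05, 0.05, 0.90],
          [0.90, 0.05, 0.05]]

print(satisfies_monotone_convergence(homophilous))
print(satisfies_monotone_convergence(cyclic))
```

In the cyclic matrix the one-step probability of moving from state 1 to state 2 is 0.9, while the two-step probability is 0.0925, which is below the limiting value of 1/3, so the path towards the limit is not monotone.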

We then have the following result. 

Proposition 3 If $m_r < 1$ and $A$ satisfies the monotone convergence property, then the model with
unbiased search satisfies the partial integration property.

Let us focus on the intuition behind the long run integration property of the model with unbiased 
search. To fix ideas, let us examine the case in which random probabilities have a homophilous 
bias. A given node can be found by a newborn node of a different type via search in different ways: 



12 As a simple illustrating example, consider a Markov process with three states where transitions from state 1 to 
state 2, 2 to 3, and 3 to 1 occur with high probability, and with the other transitions occurring with small but positive 
probabilities. Then in one period going from 1 to 2 is likely, but then it is unlikely to occur in two periods or three 
periods, but more likely in four periods, and so forth. Things eventually converge to equal likelihood on all states, 
but convergence is not monotone. One can also find such examples that are more complicated where homophily is 
present. 
one is that the newborn finds a neighbor of the given node that is of the same type as the newborn,
and another is that the newborn finds a neighbor of the given node that is of the same type as the 
given node. The first way is relatively more likely given the homophilous bias in the random part 
of link formation, but the fact that this can also occur via the second route leads this process to 
be less biased over time. Once the process has become less biased, it is even easier to be found by
nodes of other types, and so the neighborhood becomes even less biased, and this trend reinforces 
itself leading to an unbiased process in the limit. To summarize, as a node ages it becomes more 
of a "hub", attracting many links from all types in the search process. This property, that also 
underlies the weak integration property, together with unbiased search further decreases the bias 
in the in-degree of hubs. As a result, the type composition of new connections becomes even less 
biased for these hubs, and eventually the bias is eliminated. 
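The decline of the bias with age can be illustrated numerically. The sketch below assumes that the expected in-degree matrix of a node born at $t_0$ is proportional to $(t/t_0)^{m_s B_r} - I$, with rows indexing the type of the sender and columns the type of the receiver, and reports the type composition of in-links as the age ratio $t/t_0$ grows. The parameter values and the function name are illustrative assumptions.

```python
import numpy as np


def in_link_composition(x, B_r, m_s):
    """Type shares of in-links at age ratio x = t/t0 > 1 (columns: receiver type)."""
    eigvals, V = np.linalg.eig(m_s * B_r)       # B_r assumed diagonalizable
    growth = (V @ np.diag(x ** eigvals) @ np.linalg.inv(V)).real
    pi = growth - np.eye(B_r.shape[0])          # the factor n*m_r/m_s cancels in shares
    return pi / pi.sum(axis=0, keepdims=True)   # normalize each receiver's column


# Illustrative two-type example with homophilous random meetings.
p = np.array([0.6, 0.4])
A = np.array([[0.8, 0.2], [0.3, 0.7]])
B_r = np.diag(p) @ A @ np.linalg.inv(np.diag(p))
m_s = 0.8

for x in (2.0, 100.0, 1e10):
    # Share of type-0 senders among the in-links of a type-0 node.
    print(x, in_link_composition(x, B_r, m_s)[0, 0])
```

In this example the own-type share of a type-0 node starts close to the random-meeting bias of 0.8 and drifts towards the population share of 0.6, but only slowly in the age ratio, which matches the observation that a node's neighborhood integrates even though it is connected to a vanishing fraction of the population.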

The way in which an individual's neighborhood composition limits to the population frequencies 
as it ages is non-trivial. Notice that if a particular individual became connected to a large proportion 
of others over time, then his neighborhood would necessarily approximate population frequencies. 
However, we emphasize that in our model, even though an individual's degree grows without bound, 
the proportion of others to whom he is connected still vanishes over time, so this effect is not what 
drives integration. This happens because the entry rate of new individuals is constant, while the 
probability for existing individuals to acquire a new link in any given period goes to zero. 

Finally, we remark that, while the neighborhood of every node approaches a composition that
reflects the aggregate population frequencies, it converges to that distribution from a point that 
is affected by biases in the link formation process. Since those links are perpetually being formed 
and are subject to biases, the system never approaches a network that has unbiased link patterns. 
Rather, it is always the oldest nodes in the system that have the least biased neighborhoods. In 
fact, one way to see the persistent bias is to focus on out-degree rather than in-degree. Thus, we 
turn now to analyzing links by tracking where they originate. 

3.3 On the dynamics of out-degrees 

So far we have mostly focused on the dynamics of agents' in-degree. Of course, out- and in- 
degree dynamics are intimately related, as the search part of young nodes' out-degree will consist 
predominantly of old nodes, with respect to whom the search part of the process is both directly and 
indirectly unbiased (see the discussion above). Here we take a close look at the composition of out-degrees 
and how they evolve over time. This is of interest not only to better understand integration, but 
also to shed light on the evolution of homophily, that is, the tendency to form ties with agents of 
the same type. 

We first look at the steady state composition of the out-degree. Let us denote by $d_{ij,t}$ the 
proportion of links that originate from a node of type $i$ born at time $t$ that are directed towards 
nodes of type $j$. The evolution of these proportions is given by: 

$$d_{ij,t+1} = (1 - m_s) B_r(i,j) + m_s \sum_{h=1}^{H} B_r(i,h) \, \frac{\sum_{\tau=1}^{t} d_{hj,\tau}}{t}. \qquad (7)$$

The out-degree depends on the random part (first term) and on the search part (second term) 
through the average out-degree of existing nodes. In matrix form, this is written as follows: 

$$D_{t+1} = (1 - m_s) B_r + m_s B_r \, \frac{\sum_{\tau=1}^{t} D_\tau}{t}. \qquad (8)$$

To get a feeling for the limit of this process, it is useful to examine the steady state $D$ of this 
system. The steady state is such that the out-degree of each type remains unchanged in time: 

$$D = (1 - m_s) B_r + m_s B_r D. \qquad (9)$$

Proposition 4 If $m_s < 1$, then the steady state equation (9) has a unique solution $D$, and the 
system in (8) converges to $D$. 

For $m_s < 1$, the second term approaches the null matrix as $t \to \infty$. As long as the matrix $D_1$ 
is more biased than the steady state $D$ (which is true for $D_1 = A$), the bias in excess of the steady 
state decreases with time, vanishing in the long run (see also Appendix A). 

This means that the biases in the out-links formed by agents decrease over time, consistent with 
the homogenization of the search process and the in-degree of older nodes which are dominating 
the search part of the process. However, unlike the case of the in-degree of old nodes, full homog- 
enization does not occur even in the limit, since the random part of the out-degree formation does 
not vanish over time. 
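
As a numerical illustration of Proposition 4, the following minimal sketch iterates the averaged 
recursion (8) and compares the result with the solution of the steady state equation (9). It assumes 
a row-stochastic $B_r$ whose entries are random-meeting probabilities; the matrix, the value of 
$m_s$, the starting point $D_1$ and all function names are hypothetical choices made for the example. 

```python
import numpy as np


def steady_state_out_degree(B_r, m_s):
    """Solve D = (1 - m_s) B_r + m_s B_r D for D (equation (9))."""
    size = B_r.shape[0]
    return np.linalg.solve(np.eye(size) - m_s * B_r, (1.0 - m_s) * B_r)


def iterate_out_degree(B_r, m_s, D_1, steps):
    """Iterate D_{t+1} = (1 - m_s) B_r + m_s B_r * mean(D_1, ..., D_t) (equation (8))."""
    running_sum = np.zeros_like(B_r)
    D_t = D_1.copy()
    for t in range(1, steps + 1):
        running_sum += D_t                      # sum of D_1, ..., D_t
        D_t = (1.0 - m_s) * B_r + m_s * B_r @ (running_sum / t)
    return D_t


# Hypothetical two-type example: rows are the random-meeting probabilities of each type.
B_r = np.array([[0.8, 0.2],
                [0.3, 0.7]])
m_s = 0.6
D_star = steady_state_out_degree(B_r, m_s)
D_T = iterate_out_degree(B_r, m_s, D_1=np.eye(2), steps=5000)
print(D_star)
print(D_T)
```

In this example the diagonal of the steady state lies strictly between the random-meeting bias and 
the stationary shares of $B_r$: out-links are less biased than random meetings alone, but the bias 
does not disappear. 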



4 Location-based biases 

In this section we consider a specific form of bias in random meetings, and restrict the analysis 
to two types for simplicity. By making explicit how the bias in random meetings is generated, we 
accomplish two goals. First, we generate a closed form expression that describes the integration of 
individuals as they age. This formula allows us to study in more detail the integration process, and 
provides parameters which can be empirically estimated. Second, we obtain additional results on 
other features of the network, specifically on aggregate homophily at the group level and in-degree 
distributions that are type-sensitive. For each of these categories of results, the location-based 
nature of the meeting biases allows us to study the impact of changes in population frequencies on 
the structure and properties of the emerging networks. 

Nodes belong to one of two types: $\theta_1$ and $\theta_2$. With an abuse of notation, we let $p(\theta_1) = p$ and 
$p(\theta_2) = 1 - p$ in this section. There are two locations $L_1$ and $L_2$. All biases in the meeting process 
are captured by the parameter $\gamma \in [1/2, 1]$, which represents the probability that a $\theta_i$ node goes 
to location $L_i$, $i = 1, 2$.$^{13}$ Once assigned to a location, each agent meets $m_r n$ nodes uniformly at 
random among all individuals present at this location. Thus, it is simply the resulting composition 
of types in the two locations that permits any type-dependent biases in the model. We maintain 
the assumption that the search part of the process is unbiased. 

At any time $t$, the expected number of $\theta_1$ nodes at $L_1$ is $p\gamma t$, while the expected number of 
$\theta_2$ nodes in $L_1$ is $(1-p)(1-\gamma)t$. Thus, the proportion of $\theta_1$ nodes in $L_1$ is 
$\frac{p\gamma}{p\gamma + (1-p)(1-\gamma)}$, while the proportion of $\theta_1$ nodes in $L_2$ is 
$\frac{p(1-\gamma)}{p(1-\gamma) + (1-p)\gamma}$. The probability that a node of type $\theta_1$ links at 
random to a node of the same type is thus: 

$$p(\theta_1,\theta_1) = \gamma \, \frac{p\gamma}{p\gamma + (1-p)(1-\gamma)} + (1-\gamma) \, \frac{p(1-\gamma)}{p(1-\gamma) + (1-p)\gamma} \qquad (10)$$

and $p(\theta_2,\theta_2)$ is obtained by symmetry exchanging $p$ and $1-p$. 

Thus, the model generates a simple explicit relation between population frequencies and random 
meeting biases, controlled by the parameter $\gamma$. Note that when $\gamma = 1/2$, locations are independent 
of types and there is no bias: $p(\theta_i,\theta_i) = p(\theta_i)$. In contrast, when $\gamma > 1/2$ random meetings are 
biased towards own group and $p(\theta_i,\theta_i) > p(\theta_i)$. At the extreme when $\gamma = 1$, locations are perfectly 
correlated with types and individuals meet others only from their own group so that $p(\theta_i,\theta_i) = 1$. 
This allows us to derive a number of comparative statics results with respect to population 
frequencies below. 
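
The location-based meeting probability is easy to evaluate numerically. The sketch below is a direct 
transcription of the formula for $p(\theta_1,\theta_1)$ given above; the function name and the 
example values of $p$ and $\gamma$ are illustrative, not taken from the paper. 

```python
def same_type_meeting_prob(p, gamma):
    """Probability that a random link of a type-1 node goes to a type-1 node.

    p is the population share of type 1 (0 < p < 1) and gamma in [1/2, 1] is
    the probability that a node goes to its "own" location.
    """
    share_loc1 = p * gamma / (p * gamma + (1 - p) * (1 - gamma))
    share_loc2 = p * (1 - gamma) / (p * (1 - gamma) + (1 - p) * gamma)
    return gamma * share_loc1 + (1 - gamma) * share_loc2


# Illustrative values: a minority group of size 0.3.
p = 0.3
no_bias = same_type_meeting_prob(p, 0.5)        # locations independent of types
some_bias = same_type_meeting_prob(p, 0.8)
full_bias = same_type_meeting_prob(p, 1.0)      # locations coincide with types
p22 = same_type_meeting_prob(1 - p, 0.8)        # other group, by symmetry
print(no_bias, some_bias, full_bias)
```

With these values the cross-group meetings balance, $p\,(1 - p(\theta_1,\theta_1)) = (1-p)(1 - p(\theta_2,\theta_2))$, 
so the number of random links from group 1 to group 2 equals the number in the opposite direction. 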

4.1 Explicit formulas for long-run integration 

Using the expressions above we have 

$$B_r = \begin{pmatrix} p(\theta_1,\theta_1) & \frac{p}{1-p}\,(1 - p(\theta_1,\theta_1)) \\ \frac{1-p}{p}\,(1 - p(\theta_2,\theta_2)) & p(\theta_2,\theta_2) \end{pmatrix}.$$

We note that Proposition 3 (together with Lemma 5 in Appendix A) implies that if $m_r < 1$, 
then the location-based model always satisfies the partial integration property, because 
$p(\theta_1,\theta_1) \geq p$ and $p(\theta_2,\theta_2) \geq 1 - p$. 

We can now solve the mean-field equations in terms of $p$ and $\gamma$ to obtain 

Lemma 1 The in-link composition at time $t$ of a type $\theta_1$ node born at time $t_0$ is 

$$\Pi^{t_0}_t(1,1) = n \frac{m_r}{m_s} \left[ p \left(\frac{t}{t_0}\right)^{m_s} + (1-p) \left(\frac{t}{t_0}\right)^{b m_s} - 1 \right], \qquad (12)$$

$$\Pi^{t_0}_t(2,1) = n \frac{m_r}{m_s} \, (1-p) \left[ \left(\frac{t}{t_0}\right)^{m_s} - \left(\frac{t}{t_0}\right)^{b m_s} \right], \qquad (13)$$

with the analogous expressions for type $\theta_2$, where 

$$b = p(\theta_1,\theta_1) + p(\theta_2,\theta_2) - 1 = 1 - \frac{\gamma(1-\gamma)}{p(1-p)(2\gamma-1)^2 + \gamma(1-\gamma)}.$$

13 The analysis below assumes away some implicit correlations in the meeting process described here. This is as if 
(modulo matching issues) each new node in $\theta_i$ spends a proportion $\gamma$ of his time in $L_i$, and the probability of meeting 
any existing node is proportional to the time spent with it in the same location. 



We can now show that the location model generates a simple, explicit relationship between 
integration and in-degree at the individual level. This allows us to illustrate the results from the 
previous section and to obtain further predictions on the shape of integration. In particular, we 
find that the amplitude of integration tends to be lower in larger groups. 

We denote an individual's in-degree by $k$, which is a function of the node's entry date $t_0$ and 
the individual's age. Then $r^j(k)$ denotes the individual share of same-type in-links for a node of 
type $\theta_j$ at the time when its degree is $k$. 

Proposition 5 Suppose that biases are location-driven. Then, $r^j(k)$ is described by 

$$r^j(k) = p(\theta_j) + (1 - p(\theta_j)) \, \frac{\left(1 + k m_s/(n m_r)\right)^{b} - 1}{k m_s/(n m_r)}.$$

The individual share of same-type in-links is thus decreasing with $k$ and convex in $k$, converging 
asymptotically to the population shares. Moreover, for $k > 0$, $\partial r^j(k)/\partial p(\theta_j) > 0$ and 
$\partial^2 r^j(k)/\partial p(\theta_j)\,\partial k > 0$. 
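
The formula can be traced back to Lemma 1 in two steps. The following worked derivation is a sketch 
for a type $\theta_1$ node (so $p(\theta_1) = p$); it only uses the two expressions of Lemma 1 and the 
fact that the total in-degree $k$ is their sum. 

```latex
% Assumption: expressions (12) and (13) of Lemma 1 for a type-\theta_1 node.
% Total in-degree: k = \Pi^{t_0}_t(1,1) + \Pi^{t_0}_t(2,1).
\begin{align*}
k &= n\frac{m_r}{m_s}\left[\left(\frac{t}{t_0}\right)^{m_s} - 1\right]
  \;\Longrightarrow\; \left(\frac{t}{t_0}\right)^{m_s} = 1 + \frac{k m_s}{n m_r}, \\
r^1(k) &= \frac{\Pi^{t_0}_t(1,1)}{k}
  = \frac{p\left(1 + \frac{k m_s}{n m_r}\right)
          + (1-p)\left(1 + \frac{k m_s}{n m_r}\right)^{b} - 1}{k m_s/(n m_r)} \\
  &= p + (1-p)\,\frac{\left(1 + k m_s/(n m_r)\right)^{b} - 1}{k m_s/(n m_r)} .
\end{align*}
```

The first line eliminates calendar time in favor of the in-degree, which is why the result depends on 
$k$ only. 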

Thus, consistent with the general long-run integration result of Proposition 2, $r^j(k)$ converges to 
the population frequency $p(\theta_j)$ as $k$ and, hence, time, diverge. The convergence is monotonic, and 
so satisfies the property of partial integration described in Proposition 3. Notice that by application 
of Lemma 5 in Appendix A, one could demonstrate partial integration without the explicit solution 
contained in Proposition 5. However, the formula delivered by Proposition 5 allows us to derive 
some further implications. First, the relation between $r^j(k)$ and in-degree is convex, so its decrease 
with age takes place at a decreasing rate. Second, and perhaps more importantly, the relation 
between $r^j(k)$ and degree tends to be flatter in larger groups; the difference in integration between 
low-degree and high-degree nodes is smaller in larger groups. Third, this function could be readily 
fitted to data. In contexts where information on how links are formed is lacking, this could provide 
the basis of an empirical analysis of the model. This approach is illustrated in [3]. 
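
To see how this function behaves, or how it might be evaluated before fitting it to data, here is a 
minimal Python sketch of $r^j(k)$. It assumes $m_s = 1 - m_r$ and uses the expression for $b$ in terms 
of $p$ and $\gamma$ from Lemma 1; function names, parameter names and numerical values are illustrative. 

```python
def mixing_b(p, gamma):
    """b = p(th1,th1) + p(th2,th2) - 1, written in terms of p and gamma."""
    g = gamma * (1.0 - gamma)
    return 1.0 - g / (p * (1.0 - p) * (2.0 * gamma - 1.0) ** 2 + g)


def same_type_share(k, p_own, gamma, n, m_r):
    """Share of same-type in-links r^j(k) for a node with in-degree k > 0."""
    m_s = 1.0 - m_r                      # assumption: m_r + m_s = 1
    z = k * m_s / (n * m_r)
    b = mixing_b(p_own, gamma)           # b is symmetric in p and 1 - p
    return p_own + (1.0 - p_own) * ((1.0 + z) ** b - 1.0) / z


# Hypothetical parameter values, chosen only for illustration.
n, m_r, gamma = 10, 0.5, 0.8
degrees = (1, 2, 3, 100)
minority = [same_type_share(k, 0.3, gamma, n, m_r) for k in degrees]
majority = [same_type_share(k, 0.7, gamma, n, m_r) for k in degrees]
print(minority)
print(majority)
```

In this example the drop in the same-type share between low-degree and high-degree nodes is larger 
for the minority group than for the majority group, in line with the second implication above. 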

4.2 Cumulative link distributions 

We turn now to a more detailed discussion of the distribution of links across nodes in the network. 
Proposition 5 makes explicit the relationship between the degree of an individual and the local 
composition of its in-neighbors, demonstrating, in particular, the properties of partial and long-run 
integration. Integrating that relationship in order to obtain a measure of group-level homophily 
requires knowing the distribution of in-degree across individuals. This section is concerned precisely 
with analyzing those degree distributions, which have become a cornerstone of social network 
analysis. 

Even with only two groups, capturing the distribution of links becomes substantially more 
complex, relative to the one-group case, as nodes can connect to both same- and different-type 
nodes, and one wants to keep track of the different kinds and sources of links. In this context, we 
can keep track of seven different degree distributions rather than one. Define $F_{ij}$ as the distribution 
of the in-degrees of type $\theta_i$ nodes paying attention only to links coming from nodes of type $\theta_j$, 
$i, j = 1, 2$. Then $F_1$ and $F_2$ are the standard in-degree distributions of $\theta_1$ and $\theta_2$ nodes (ignoring 
the types of neighbors), and finally $F$ is the total in-degree distribution of the entire society. 

We observe that as a consequence of Lemma 1, all of the degree distributions have a power-law 
upper tail, as has been documented extensively in empirical contexts, starting from [1]. Further, we 
are able to make predictions about how the distribution of links from own- and other-group nodes 
relate to each other, making clear the importance of whether a node is in the majority or minority 
group. Finally, we show that changing the bias in location-based meetings causes the distributions 
to shift in the sense of first order stochastic dominance (Proposition 7). 

To begin the analysis consider, for example, $F_1$. Observe that $F_1(k) = 1 - t_0/t$, where $t$ is an 
arbitrary time period and, by definition, $t_0$ is the birth date of the node that has in-degree $k$ at 
time $t$. $t_0$ can be solved for under the mean-field approximation. This defines $F_1$ implicitly as a 
function of $k$, and an analogous method works for the other distributions. While these equations do 
not usually yield closed-form solutions, they still allow us to derive important properties of the 
degree distributions. Our first such result orders the degree distributions as the number of out-links is varied. 
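
For the total in-degree the implicit definition can in fact be inverted by hand, which gives a useful 
reference case. The sketch below assumes that the total in-degree of a node born at $t_0$ is the sum of 
the two expressions in Lemma 1, $k = n (m_r/m_s)[(t/t_0)^{m_s} - 1]$, and that $m_s = 1 - m_r$; solving 
for $t_0/t$ then yields $1 - F_1(k) = (1 + k m_s/(n m_r))^{-1/m_s}$. Function names and parameter values 
are illustrative. 

```python
def in_degree_survival(k, n, m_r):
    """1 - F(k): share of nodes with in-degree above k (mean-field approximation)."""
    m_s = 1.0 - m_r                      # assumption: m_r + m_s = 1
    return (1.0 + k * m_s / (n * m_r)) ** (-1.0 / m_s)


def in_degree_cdf(k, n, m_r):
    """F(k) = 1 - t_0 / t, with t_0 solved from the mean-field in-degree."""
    return 1.0 - in_degree_survival(k, n, m_r)


# Hypothetical parameters: n = 10 out-links, half of them formed at random.
n, m_r = 10, 0.5
tail_ratio = in_degree_survival(1e8, n, m_r) / in_degree_survival(1e9, n, m_r)
print(in_degree_cdf(0, n, m_r), in_degree_cdf(50, n, m_r), tail_ratio)
```

In the upper tail the survival function falls like $k^{-1/m_s}$, which is the power-law behavior 
referred to above; the type-specific distributions $F_{ij}$ do not admit such a simple inversion. 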

We first ask how $F_{ii}$ and $F_{ij}$ compare to each other. That is, we focus on one group $\theta_i$, and for 
those nodes we compare the distributions of the number of links coming from the own group 
and the other group. We find that the relationship depends on the size of group $i$. 

Proposition 6 Fix $p > 1/2$ so that the majority group is group 1. Then 

(i) $F_{11}$ FOSD $F_{12}$; 

(ii) If $\gamma < 1$, then $F_{22}$ never FOSD $F_{21}$; 
(iii) $F_{21}$ FOSD $F_{22}$ if and only if $b \leq \frac{2p-1}{2p}$; 
(iv) $F_1 = F_2$. 

These results express the interplay of two effects. On the one hand, there is a direct size effect 
through which nodes receive more links from the larger group. On the other hand, homophily leads 
nodes to receive relatively more links from nodes of the same group. In the larger group, both 
effects are aligned, which implies that $F_{11}$ FOSD $F_{12}$. In the smaller group, however, these effects 
pull in opposite directions. 

The third item in the proposition says that if homophily is not too large, the size effect dominates 
and $F_{21}$ FOSD $F_{22}$. The condition in part (iii) requires that $b$, and hence $\gamma$, be lower than or equal 
to some threshold value.$^{14}$ We can see that this threshold is increasing in $p$. As the size of the larger 
group increases, the size effect becomes relatively more important and $F_{21}$ ends up dominating $F_{22}$ 
for a larger range of the parameters. In contrast, the second item says that even if homophily is 
large, as long as it is not perfect ($\gamma < 1$), the homophily effect cannot dominate the size effect. The 
explanation lies with nodes of high degree. We can show that, in the upper tail, $F_{22}$ always lies 
above $F_{21}$. In other words, the size effect dominates for the hubs of the smaller group, and they 
tend to get relatively more connections from nodes of the larger group. This is related to partial 
integration: the largest degree nodes are more integrated with respect to their in-degree. In other 
words, the hubs in the minority group have the greatest proportion of their in-neighbors from the 
majority group. 
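
The condition in part (iii) can be checked numerically. The sketch below assumes the expression for 
$b$ from Lemma 1 and the threshold $b \leq (2p-1)/(2p)$, and compares it with the equivalent form in 
terms of $\gamma(1-\gamma)$ stated in footnote 14. Function names and the parameter grid are 
illustrative choices. 

```python
def mixing_b(p, gamma):
    """b = p(th1,th1) + p(th2,th2) - 1, written in terms of p and gamma."""
    g = gamma * (1.0 - gamma)
    return 1.0 - g / (p * (1.0 - p) * (2.0 * gamma - 1.0) ** 2 + g)


def size_effect_dominates(p, gamma):
    """Condition of part (iii): b <= (2p - 1) / (2p), for a majority share p > 1/2."""
    return mixing_b(p, gamma) <= (2.0 * p - 1.0) / (2.0 * p)


def size_effect_dominates_footnote_form(p, gamma):
    """Same condition written as a lower bound on gamma * (1 - gamma)."""
    bound = p * (1.0 - p) / (1.0 + 2.0 * (1.0 - p) * (2.0 * p - 1.0))
    return gamma * (1.0 - gamma) >= bound


# Illustrative grid of majority shares and location biases.
shares = [0.55, 0.6, 0.7, 0.8, 0.9]
biases = [0.5, 0.6, 0.7, 0.8, 0.9, 0.99]
grid = {(p, g): size_effect_dominates(p, g) for p in shares for g in biases}
print(grid)
```

On this grid the two forms of the condition agree, and a larger majority share tolerates a stronger 
location bias before the condition fails. 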

The last part provides a particular empirical prediction: independently of the homophily biases, 
the relative group sizes, and the proportion of links formed through the random meeting process, 
the in-degree distributions of the two groups must be identical. 

The final result in this section describes how the distributions of inter- and intra-group links 
respond to changes in the meeting bias. 

Proposition 7 Assume biases are location-driven. Fix $p$, $m$ and $a$ and take $\gamma < \gamma'$. Let $F_{ij}$ be the 
distributions corresponding to $\gamma$ and let $F'_{ij}$ be the distributions corresponding to $\gamma'$, for $i, j = 1, 2$. 
Then $F'_{11}$ and $F'_{22}$ strictly FOSD $F_{11}$ and $F_{22}$, while $F_{21}$ and $F_{12}$ strictly FOSD $F'_{21}$ and $F'_{12}$. 

When the meeting bias increases, no matter the group sizes, individuals tend to form more links 
within their own groups, and fewer links across groups. 

4.3 Long run homophily and group size 

This section complements the general analysis of Section 3.3 with a detailed inspection of the steady 
state out-degree composition in the two-type location-based model. In this specific context, our 
aim is to understand how the built-in homophily in random meetings translates into biased long 
run proportions of out-links that stay within a group, and how these proportions relate to groups' 
frequencies. Proposition 8 below tells us that group-level homophily is strongest when the groups 
have nearly equal sizes, and vanishes at the extreme when one group dominates society. Further, 
Corollary 1 tells us that, for given population frequencies, group-level homophily is stronger when 
the bias in random meetings or the relative prevalence of random meetings is higher. 

14 Using the definition of $b$ from Lemma 1, the threshold can be written $\gamma(1-\gamma) \geq p(1-p)/(1 + 2(1-p)(2p-1))$. 

In preparation for this result, define the homophily index $H(\theta_i)$ as the expected proportion of 
the links formed by a new $\theta_i$ node that are with same-type nodes. At the steady state, $H(\theta_1)$ and 
$H(\theta_2)$ satisfy the following equation 

$$H(\theta_1)\, n = m_r n \, p(\theta_1,\theta_1) + m_s n \left[ H(\theta_1)\, p(\theta_1,\theta_1) + (1 - H(\theta_2))(1 - p(\theta_1,\theta_1)) \right],$$

as well as its symmetric counterpart. We know from the results of Section 3.3 that the steady 
state solution will be greater than $p(\theta_i)$, as the bias in out-links does not vanish with time, since 
the random meeting process always constitutes a non-trivial portion of out-degree. This equation 
decomposes the expected number of links formed within group as the sum of two terms capturing 
links formed at random and links formed through search. At random, this number is by definition 
proportional to the probability $p(\theta_1,\theta_1)$. Through search, there are two ways to connect within 
the group, depending on whether the intermediary node is of the same type or not. Solving these 
equations yields, for type $\theta_1$, 

$$H(\theta_1) = \frac{a \, p(\theta_1,\theta_1) + 1 - p(\theta_2,\theta_2)}{a + 2 - p(\theta_1,\theta_1) - p(\theta_2,\theta_2)},$$

recalling that $a = m_r/m_s$ represents the ratio of the number of links formed at random to the 
number of links formed through search. Combining this expression with equation (10), we obtain 
an explicit formula linking group homophily and population frequencies, controlled by the ratio of 
random to search meetings $a$ and the bias parameter $\gamma$. In particular, we can see that $H(\theta_i)$ is 
increasing in $a$. As $a$ tends to zero and search dominates the network's formation, $H(\theta_i)$ tends to 
$p(\theta_i)$, while when $a$ becomes large and random meetings dominate, $H(\theta_i)$ tends to $p(\theta_i,\theta_i)$. 

This means that the larger the role of search in the network formation process, the more 
integrated the society becomes, consistent with the intuition obtained from the general setting. 
This provides an expression at the group-level of the idea that search tends to reduce the imposed 
bias in meetings. The reason for this is that nodes found through search are more likely to be of the 
other type than nodes found at random, due to the possibility that the intermediary node is of the 
other type. As search dominates, homophily disappears completely and links are formed according 
to population frequencies. On the other hand, when the random meeting process dominates, links 
are formed according to the probabilities determined by the location-based meetings. 

To analyze how group-level homophily varies with group size, we find it useful to look at a 
normalized index introduced by [8] (see [S]). Define relative homophily (or inbreeding homophily) 
as follows 

$$IH_i = \frac{H(\theta_i) - p(\theta_i)}{1 - p(\theta_i)}.$$

Relative homophily is positive when a group forms a higher proportion of its links within the group 
than would be implied by the population sizes, and is normalized to have a maximal value at 
unity. Again, from the results of Section 3.3, relative homophily will be positive in steady state. 
However, we can now demonstrate a more detailed relationship between relative homophily and 
relative group size. The following result shows how relative homophily changes as the composition 
of society varies. 

Proposition 8 $IH_i$ is symmetric around $p = 1/2$. It is equal to zero at $p = 0$ and $1$; it increases 
from $p = 0$ to $p = 1/2$, and decreases from $p = 1/2$ to $p = 1$, and is concave. 

Thus, in the extreme cases where one group dominates society, relative homophily disappears. 
Natural mixing occurring at each location tends to homogenize meetings, and this effect overcomes 
the impact of location biases when sizes are asymmetric. In all other cases, however, relative 
homophily is positive, and is strongest for intermediate size groups, reaching a maximum when the 
groups have equal size. Interestingly, the equilibrium mixing model of [U] generates an analogous 
result through a different analysis, and this prediction is supported empirically looking at racial 
composition in the AddHealth data. 

The next result shows how relative homophily responds to changes in the meeting process. 

Corollary 1 $IH_1(p)$ is shifted up by an increase in $a$ or by an increase in $\gamma$. 

That is, for a given society and homophily biases, decreasing the role of search-based meetings 
increases relative homophily. This results from the "dampening effect" of search-based meetings 
on network homophily. The more prevalent are search-based meetings in the formation process, 
the lower homophily will be, as "friends of friends" are less likely to be of the same type than are 
individuals met through the biased random meetings. Finally, increasing the location bias, all else 
equal, increases relative homophily for any value of a, since this is the parameter that controls the 
extent to which random meetings are exogenously biased. 
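
The comparative statics of this section can be explored with a few lines of code. The sketch below 
assumes the steady-state solution $H(\theta_1) = [a\,p(\theta_1,\theta_1) + 1 - p(\theta_2,\theta_2)] / [a + 2 - p(\theta_1,\theta_1) - p(\theta_2,\theta_2)]$, 
the location-based meeting probabilities of Section 4, and the normalization $IH_1 = (H(\theta_1) - p)/(1 - p)$, 
which is the Coleman-type index also used in Section 6.1. Function names and parameter values are illustrative. 

```python
def same_type_meeting_prob(p, gamma):
    """Location-based probability that a random link stays within a group of share p."""
    share_loc1 = p * gamma / (p * gamma + (1 - p) * (1 - gamma))
    share_loc2 = p * (1 - gamma) / (p * (1 - gamma) + (1 - p) * gamma)
    return gamma * share_loc1 + (1 - gamma) * share_loc2


def homophily_index(p, gamma, a):
    """Steady-state share H(theta_1) of out-links that stay within group 1."""
    p11 = same_type_meeting_prob(p, gamma)
    p22 = same_type_meeting_prob(1 - p, gamma)
    return (a * p11 + 1 - p22) / (a + 2 - p11 - p22)


def relative_homophily(p, gamma, a):
    """Normalized index IH_1 = (H - p) / (1 - p); assumed normalization, see lead-in."""
    return (homophily_index(p, gamma, a) - p) / (1 - p)


# Illustrative values: a = m_r / m_s = 1 and location bias gamma = 0.8.
gamma, a = 0.8, 1.0
profile = {p: relative_homophily(p, gamma, a) for p in (0.1, 0.3, 0.5, 0.7, 0.9)}
print(profile)
```

In this example the index is symmetric around $p = 1/2$, peaks there, and moves up when either $a$ or 
$\gamma$ is increased, as Proposition 8 and Corollary 1 state. 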

5 Integration with biased search 

We now allow for the search part of the process to be biased as well as the random part. We do 
this for two main reasons. First, we want to assess what degree of integration is still compatible 
with this more flexible specification of biases in the network formation process. Second, the more 
realistic assumption of some form of bias also in the search process is needed to match the empirical 
patterns of scientific citations that we study in the final section of this paper. 

To prepare for this more general version of the model, remember that in the analysis of Section 
3 the random bias affects the choice of parents that are used to find the additional $n m_s$ search 
connections. In general, the bias that affects the search process may differ from, and possibly be 
a reinforcement of, the bias induced by the random process. This additional bias is described via 
an $H \times H$ matrix where each element is positive and of the form $B_s(\theta, \theta')$.$^{15}$ A value of 1 indicates 
no additional bias, while a value greater (less) than 1 indicates a positive (negative) search bias 
of type $\theta$ towards type $\theta'$. This is the distortion in the relative probabilities with which type $\theta$ 
searches the out-neighborhoods of its parent nodes. 

The mean-field approximation of the process is described by 

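$$P^{j}_{t+1}(\theta, \theta_j) = \frac{m_r}{t} B_r(\theta, \theta_j) + \frac{m_s}{t} \sum_{\theta'=1}^{H} B_r(\theta, \theta') B_s(\theta, \theta') \sum_{\lambda=j}^{t} P^{j}_{\lambda}(\theta', \theta_j). \qquad (14)$$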

15 There are constraints on this bias matrix to have the resulting output be well-defined probabilities, but much 
can be deduced for general forms of the matrix, and so we only specify the (obvious) constraints as they become 
necessary. A more rigorous treatment and some explicit examples are in [11]. 
The product $B_r(\theta, \theta') B_s(\theta, \theta')$ in (14) describes the probability applied by type $\theta$ to the selection 
of random nodes in search of type $\theta_j$. Note that the bias is independent of both time variables $j$ 
and $t$, and of the type $\theta_j$ of the target. 

In matrix form, the system becomes: 



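$$P^{j}_{t+1} = \frac{m_r}{t} B_r + \frac{m_s}{t} (B_s \circ B_r) \sum_{\lambda=j}^{t} P^{j}_{\lambda}, \qquad (15)$$

where $\circ$ is the Hadamard product: $(B_s \circ B_r)_{ij} = (B_s)_{ij} \cdot (B_r)_{ij}$. 
From the decomposition given in equation (2), it follows that 

$$B_s \circ B_r = B_s \circ (Q A Q^{-1}) = Q (B_s \circ A) Q^{-1},$$

where the biases are such that $B_s \circ A$ is still a Markov matrix. 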

The model with unbiased search, analyzed in previous sections, is a special case of this model, 
where the matrix $B_s$ is a matrix of all 1's. 

As we will see, weak integration still holds in the presence of biased search. Moreover, while 
long-run integration generically does not hold, partial integration occurs under an additional mono- 
tonicity condition. In this case, while the distortions in type frequencies among an individual's 
neighbors decrease over time, they do not vanish as the node accumulates links. 

Proposition 9 If $m_r < 1$, the general model with biased search satisfies the weak integration 
property, generically does not satisfy long-run integration, and satisfies the partial integration 
property provided that $B_s \circ A$ satisfies monotone convergence. 



6 An empirical application to citation data 

In this section we use our random-search model to study the patterns of cross-subfield scientific 
citations in physics. 

The use of scientific citation data is motivated by several factors. First, there is a literature 
that shows that key aspects of the time evolution of citations can be captured by models based 
on variations of a preferential attachment mechanism. [28], and then [31] for ISI papers and 
also [25], found that older papers enjoy an advantage in receiving citations, independently of the 
intrinsic quality of the paper. Although a bias in favor of recent papers allows for a better fit 
of certain datasets (see O [35]), the evidence of a rich-gets-richer mechanism seems sound. In 
addition, Refs. [34], [36] have shown that this evidence can be accounted for when preferential 
attachment is generated by a random-search mechanism such as the one we use in this paper, where 
authors first "randomly" (from the econometrician's perspective) select papers, and search these 
papers' reference lists to find additional citations. 
There is less exploration of the patterns of citations across disciplines or across other types 
of categories in which research may be organized. Several works have shown that geographical 
distance and national boundaries are two important determinants of citation patterns, while [20] 
has shown that citation patterns are quite uniform across sub-fields in the high energy physics 
dataset (SPIRES). Also, [33] finds a relationship between the homophily in citing other papers and 
the total citations received by computer science papers. 

Thus, the generative process of citations is consistent with basic aspects of the network for- 
mation process studied in this paper: First, it is a growing network process, since papers appear 
in chronological order, and old papers do not exit or die. Second, citations are directional, and 
only citations from newer to older nodes are possible.$^{16}$ Third, citations cannot disappear, and 
accumulate over time. In addition, and specifically to our model, nodes have "types", which we 
identify with the scientific classification of a paper (see below for details). Finally, a key element 
of our process is that links are formed both at random and by search through established links. In 
the case of citations, these two channels of search are present, since there is a difference between 
citations that come from direct knowledge of a paper and citations that originate from the list of 
references of other papers that one has read. 

We use the American Physical Society (APS) citations dataset, which reports papers published 
in journals of the APS between 1/1/1985 and 12/31/2003.$^{17}$ There is a total of 207912 papers and 
1488866 citations (roughly 7 citations per paper on average). Around 42.8 percent of the papers 
are never cited, while the most cited one receives 952 citations. 

Types are defined by the first digit of the first (out of at most four) PACS classification codes 
that characterize each paper: 

• 00: General; 

• 10: The Physics of Elementary Particles and Fields; 

• 20: Nuclear Physics; 

• 30: Atomic and Molecular Physics; 

• 40: Electromagnetism, Optics, Acoustics, Heat Transfer, Classical Mechanics, Fluid Dynam- 
ics; 

• 50: Physics of Gases, Plasmas, and Electric Discharges; 

• 60: Condensed Matter: Structural, Mechanical, and Thermal Properties; 

16 There are revisions of papers that allow them to cite relatively contemporaneous papers, but that seems to be a 
minor factor in the overall process. 

17 The journals in our data are Physical Review A, B, C, D, E, Letters, STPER and RevModPhys; considering 
papers that have PACS classification codes which became compulsory in 1985 in the main six journals of the APS. 
This dataset is available online from http://prola.aps.org/, and contains the data analyzed in [29], [30]. 
• 70: Condensed Matter: Electronic Structure, Electrical, Magnetic, and Optical Properties; 

• 80: Interdisciplinary Physics and Related Areas of Science and Technology; 

• 90: Geophysics, Astronomy, and Astrophysics. 

We first note that the time profiles of types' population shares, measured, for each type and for 
each year, as the proportion of the total papers published during that year that are of that given 
type, are fairly stationary during the whole period (see Figure 1).$^{18}$ The approximate stationarity of 
most categories is roughly in line with our assumption in the theoretical model that probabilities 
of birth of various types are time invariant. There are additional dramatic (unexplained) changes 
in the composition of fields following 2003, and so we truncate our data at that point. 

In order to identify the various elements of our theoretical model, we need to distinguish citations 
that originate from a direct random draw from the pool of all existing papers ( "random" citations) 
from those that originate from a search process that goes through the references contained in one's 
random citations ("search" citations). To do this, we proceed as follows. We first identify a citation 
from paper A to paper C as a "search" citation if there exists some paper B with the following 
properties: 1) B is published after C and before A, 2) A cites B, and 3) B cites C. 
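
The classification rule above can be sketched in a few lines. The code below is an illustration of 
the three stated conditions only; the input format (a list of citing/cited pairs and a dictionary of 
publication dates), the function names and the toy data are assumptions made for the example, not 
a description of how the APS data were actually processed. 

```python
from collections import defaultdict


def classify_citations(citations, pub_date):
    """Label each citation (A, C) as "search" or "random".

    citations: iterable of (citing, cited) pairs.
    pub_date:  dict mapping each paper to a comparable publication date.
    A citation A -> C is "search" if some paper B, published after C and
    before A, is cited by A and itself cites C.
    """
    refs = defaultdict(set)
    for citing, cited in citations:
        refs[citing].add(cited)

    labels = {}
    for a, cited_by_a in refs.items():
        for c in cited_by_a:
            is_search = any(
                c in refs.get(b, ())                       # B cites C
                and pub_date[c] < pub_date[b] < pub_date[a]
                for b in cited_by_a                        # A cites B
                if b != c
            )
            labels[(a, c)] = "search" if is_search else "random"
    return labels


def search_share(labels):
    """Fraction of citations labelled as "search"."""
    return sum(v == "search" for v in labels.values()) / len(labels)


# Toy example with four hypothetical papers.
pub_date = {"A": 1995, "B": 1990, "C": 1985, "D": 1992}
citations = [("A", "B"), ("A", "C"), ("B", "C"), ("A", "D")]
labels = classify_citations(citations, pub_date)
print(labels, search_share(labels))
```

In the toy example only the citation from A to C is labelled as a search citation, because B sits 
between the two papers in time, is cited by A and cites C. 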

This method obviously has some degree of arbitrariness and will not perfectly identify how the 
authors found the papers they cite. The direction of the bias introduced by this simplification is, 
however, not clear. On one side, it overstates the weight of "search" in the citation process, since 
A may well cite C because C is an important paper in the field, which is also the reason why B cites 
C, without A having learned about C through B. On the other side, however, it could be that authors 
of paper A know about paper C only because they came across paper B, which cites C: they could decide 
to cite only C because it contains an older version of the same idea. That is, it can be that some papers 
are found through the search process, without the authors ever citing the intermediate paper, and 
so some citations are coded as random even though they were found through search. We stick with 
the strict interpretation of the model, given that we have no other way of identifying the actual 
process that the authors followed (see [36] for an interesting strategy of identification). 

18 The only two sharp changes in the time profiles are around 1990 for type 10 (Physics of Elementary Particles and 
Fields) and type 70 (Condensed Matter Electronic Structure, Electrical, Magnetic, and Optical Properties). These 
changes were explained to us by the APS as follows (in private communication in response to our queries about 
these changes). The increase of type 70 was driven by the sharp increase in the subcategory 74 "Superconductivity", 
due to the spike in interest in High Temperature Superconductivity that began in 1986 (including some switching of 
fields of high energy particle physicists some of whom would have come from type 10). The sharp decrease of type 
10 was due to a 1989 policy by the APS that increased page charges by 60 percent in the Physical Review journals, 
including Physical Review D where much elementary particle/high energy physics is published. Some authors reacted 
by publishing in other journals (outside of the APS data set) and increased use of the physics arXiv that started in 
1991, and this reaction was particularly heavy in the particle physics community. In 1996 APS removed page charges 
for properly prepared electronic manuscripts in Physical Review C and D, and in 1999 did the same for the other 
Physical Review journals. 

[Figure 1: Shares of types' proportions over time. Horizontal axis: year (1985 to 2000). One series per 
type: 00 General; 10 Elementary Particles; 20 Nuclear Physics; 30 Atomic and Molecular; 
40 Electromagnetism...; 50 Physics of Gases...; 60 Condensed Matter 1; 70 Condensed Matter 2; 
80 Interdisc. Physics; 90 Astrophysics.] 

Using this method we identify 56.34 percent of total citations as "search" citations. We then 
classify the remaining 43.66 percent of citations as "random" citations, being the complement of 
the "search" citations. 

6.1 Bias in random out-citations 

In order to identify the bias in the random part of the process, we compare the share of "random" 
out-citations that are of the same type as the citing paper with the population share of the type 
of the citing paper. The first share ($q_{out}$ in Table 1) is obtained by averaging the share of random 
same-type out-citations of all papers of a given type during the whole time period. The second 
share ($w$ in Table 1) is obtained as the share of papers of a given type over all papers in the sample 
for the whole time period. 



Classification                    00    10    20    30    40    50    60    70    80    90
Same type random cites: q_out     0.27  0.10  0.69  0.60  0.42  0.43  0.33  0.49  0.23  0.07
Size of classification: w         0.13  0.10  0.08  0.08  0.07  0.02  0.15  0.32  0.03  0.02
Coleman Index                     0.17  0.00  0.67  0.57  0.37  0.42  0.21  0.26  0.20  0.05

Table 1: Same-type bias in the overall citations. 



The difference between these two shares is positive for all types and substantial for most, with maximum 
value of about .61 for type 20, minimum value of .001 for type 10 (Elementary Particles), and 
average value of .26. Normalizing, for each type, this difference by the maximal potential 
difference given by one minus the population share of the type, we obtain the Coleman (1958) 
homophily index of each type ($IH = (q_{out} - w)/(1 - w)$ in Table 1).^19 This index turns out not to 
be correlated with types' population shares. 
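
As a rough illustration of the normalization just described, the following sketch recomputes the index from the two-decimal shares printed in Table 1. The function and variable names are illustrative only, and because the table entries are rounded, the recomputed values need not reproduce the printed Coleman row to the last digit.

```python
# Shares copied from Table 1 (rounded to two decimals in the source).
types = ["00", "10", "20", "30", "40", "50", "60", "70", "80", "90"]
q_out = [0.27, 0.10, 0.69, 0.60, 0.42, 0.43, 0.33, 0.49, 0.23, 0.07]
w = [0.13, 0.10, 0.08, 0.08, 0.07, 0.02, 0.15, 0.32, 0.03, 0.02]


def coleman_index(q, share):
    """Excess same-type share, scaled by its maximal possible value 1 - share."""
    return (q - share) / (1.0 - share)


# Index per type, and the average raw gap q_out - w across types.
indices = {t: coleman_index(q, s) for t, q, s in zip(types, q_out, w)}
mean_gap = sum(q - s for q, s in zip(q_out, w)) / len(types)

for t in types:
    print(t, round(indices[t], 3))
print("mean gap:", round(mean_gap, 3))
```

Dividing by $1 - w$ is what makes a small type (such as 20) comparable with a large one (such as 70): both are measured against the largest gap they could possibly display.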

6.2 Search bias, long-run integration, and partial integration 

One challenge with an empirical investigation of the various concepts of integration is that certain 
papers happen to be intrinsically more cited than others, simply because they are more fundamental 
or important than others for their discipline. This type of "fitness" is independent of the age of 
the paper, and is not modeled in our analysis.^20 More importantly, it could potentially outweigh 
the effect of time, and of the large in-degree that older nodes accumulate in time, which is one of 
the forces behind the long-run integration property. 

We deal with this problem by looking at the type-composition of the first r citations of each 
paper, thereby replacing time with citation order. This allows us to normalize the time-scale of 
each single paper, as if they all had the same fitness. In this new context, which takes a form similar 
to that of Proposition 5, the hypothesis we are testing is whether the homophily of the in-degrees 
of a paper decreases with the order of its in-citations. This is meant to capture the main force 
that leads to partial integration: the growth of nodes' in-degree is to a large extent composed of 
in-citations of the "search" type, which in the case of unbiased search are less biased towards one's 
own type than in-citations of the "random" kind. 

A way to test partial integration is to measure the probability that a given citation originated 
from a paper in the same field as the cited paper; more precisely, we estimate the change in this 
probability associated with an increase in the order of the in-citation. The partial integration 
hypothesis predicts that this probability should decrease with the order of the in-citation. We 
estimate a probit model where a dummy in-group (taking value one if the citation comes from a 
paper in the same field as the cited paper and zero otherwise) is estimated as a function of the 
order of the citation. The dataset contains 1,034,569 total citations, and we run separate regressions 
for each sub-field (i.e., we look at all citations received by all papers belonging 
to each given sub-field). Results are reported in the next table. 

Type          Change in probability   p-value
[illegible]   -.00095**               0.000
00             .00098**               0.000
10            -.00067**               0.000
20            -.001[illegible]**      0.000
30            -.00252**               0.000
40            -.00034**               0.000
50             .0008**                0.009
60            -.00112**               0.000
70            -.00024**               0.000
80            -.00418**               0.000
90             .00028                 0.096

Table 2: Expected change in probability of a citation being homophilous associated with an increase in 
the order of the citation. (**) = significant at the 99% level. 

We observe that for all sub-fields except type 00, type 50 and type 90, citations of higher order 
are less likely to come from papers within the same field. This is true also in the aggregate if 
we compute the expected change in probability without keeping track of the specific field of the 
receiving paper. Types 00 and 50 behave differently and have an increase in the probability of 
homophilous citations, while type 90 has no significant change in probability. At least broadly, 
this is consistent with partial integration. On average, the probability of being cited by papers 
of the same field decreases by 0.025 percentage points with each additional in-citation; that is, 
the share of in-field citations decreases by 2.5 percentage points between the first and the 100th 
in-citation. 

19 This normalization has the purpose of allowing for meaningful comparison of groups of different sizes, by taking 
into account the maximal potential amount of homophily that each group has. See [5] for more discussion. 
20 See [2] for the analysis of a single-type random-search process which is based on fitness. 

Beyond testing partial integration, we also examine whether the fraction of a paper's citations 
from random versus search becomes more tilted towards search as its number of citations grows, 
as would be consistent with the model. This is plotted in Figure 2. 

There is an evident upward trend, consistent with the model's predictions. 

7 Concluding remarks 

This paper contributes to our understanding of how heterogeneity and homophily among individuals 
impact the networks that they form. We have built on the framework of Jackson and Rogers 
(2007), in which links result both from meeting others at random and through introductions to 
their neighbors, allowing both of these channels for link formation to be biased by the types of the 
nodes involved. Some applications of interest have significant type-based biases. Scholars are more 
likely to read papers from their own field, people are more likely to befriend those with a similar 
background, organizations have closer ties within departments, and so on. We do not attempt here 
to model the source of these biases, but take the model as a reduced form representation of the 
resulting effects that such biases have on how links are formed. 

Figure 2: Average Fraction of Search Citations by Citation Order 

Within the context of this framework, long-run integration, whereby old nodes obtain local 
networks that asymptotically resemble the population at large, occurs if and only if the search part 
of the network formation process (the second channel mentioned above) is type-unbiased, so that 
the only bias in the process comes from which nodes are initially found through the random meeting 
process. 

Understanding the neighborhoods of old nodes is important since these nodes constitute the 
hubs of the network. If one is interested in processes occurring on the network such as, e.g., strate- 
gic behavior or diffusion processes, then the characteristics of hub nodes are of central importance. 
On the other hand, there are many other important properties of the network that may be affected 
by type-based biases. In order to analyze these properties, we turned to the more specific model 
of location-based biases among two groups, deriving implications regarding type-based degree dis- 
tributions and on group-level homophily. 

We leave open a number of interesting questions. First, there is the matter of exploring the ex- 
tent to which the results from the location-based biases generalize. Second, there are other kinds of 
network formation processes in which similar questions could be addressed. In fact, in the different 
model of [HlOU], there are some similar results, but in general we have an incomplete understanding 
of how heterogeneity impacts network formation. Third, there are many summary statistics of 
networks that can be generalized to a multi-type framework, including clustering measures, that 
can be analyzed in future work. 
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Appendix A Some results on Markov matrices 



This first Section of the Appendix provides some results that are necessary for the proofs of our 
results. Take an H x H Markov matrix M with all positive elements, i.e. a positive Markov matrix. 

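Lemma 2 For every $x > 0$ the $H \times H$ matrix 

\[ M(x) = \frac{1}{e^{x}-1} \sum_{\mu=1}^{\infty} \frac{x^{\mu}}{\mu!} M^{\mu} = \sum_{\mu=1}^{\infty} \frac{x^{\mu} M^{\mu}}{\mu! \left(\exp(x) - 1\right)} \]

is a Markov matrix. 

Proof: For every $\mu \in \mathbb{N}$, $M^{\mu}$ is a Markov matrix. To show that $M(x)$ is a Markov matrix, 
we need to prove that for every $i, j \in \{1, \dots, H\}$ we have that $0 \le M(x)_{ij} \le 1$, and that 
$\sum_{k=1}^{H} M(x)_{kj} = 1$. 

The first condition comes from the fact that $M(x)_{ij}$ is a convex combination of (an infinite number 
of) probabilities. 

The second condition comes from the fact that 

\[ \sum_{k=1}^{H} \sum_{\mu=1}^{\infty} \frac{x^{\mu}}{\mu!\,(e^{x}-1)} [M^{\mu}]_{kj} = \sum_{\mu=1}^{\infty} \frac{x^{\mu}}{\mu!\,(e^{x}-1)} \left( \sum_{k=1}^{H} [M^{\mu}]_{kj} \right) = \sum_{\mu=1}^{\infty} \frac{x^{\mu}}{\mu!\,(e^{x}-1)} = 1 . \]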
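
The averaging in Lemma 2 can be checked numerically. The sketch below is only an illustration under stated assumptions: the matrix `M` is a made-up positive Markov matrix, stored so that each row sums to one (transpose it if the other index convention is intended), and the infinite series is truncated after `terms` terms.

```python
import math


def matmul(A, B):
    """Plain matrix product for square matrices stored as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def averaged_matrix(M, x, terms=200):
    """Truncated series (e^x - 1)^{-1} * sum_{mu=1}^{terms} x^mu / mu! * M^mu."""
    n = len(M)
    total = [[0.0] * n for _ in range(n)]
    power = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    weight = 1.0                      # running value of x^mu / mu!
    for mu in range(1, terms + 1):
        power = matmul(power, M)
        weight *= x / mu
        for i in range(n):
            for j in range(n):
                total[i][j] += weight * power[i][j]
    scale = math.expm1(x)             # e^x - 1
    return [[entry / scale for entry in row] for row in total]


# Hypothetical positive Markov matrix (each row sums to one).
M = [[0.7, 0.2, 0.1],
     [0.3, 0.5, 0.2],
     [0.2, 0.2, 0.6]]
Mx = averaged_matrix(M, x=2.5)
row_sums = [sum(row) for row in Mx]
print(row_sums)
```

The weights $x^{\mu}/(\mu!\,(e^{x}-1))$ are a Poisson distribution conditioned on being at least one, which is why they sum to one and why $M(x)$ inherits the Markov property from the powers of $M$.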



$M(x)$ can be seen as a weighted average of the infinite elements of $\{M^{\mu}\}_{\mu \in \mathbb{N}}$. 
We know that 

\[ \lim_{\mu \to \infty} M^{\mu} = \begin{pmatrix} v(M)' \\ \vdots \\ v(M)' \end{pmatrix} \tag{a} \]

where the row-vector $v(M)'$ is the unique eigenvector associated with eigenvalue 1 of matrix $M$ (up 
to a normalization that its elements sum to one, by the Perron-Frobenius Theorem). We define 
this matrix at the limit, with all equal elements on each column, as $\bar{M}$. Now we prove a relation 
between the limit of $M(x)$ and $\bar{M}$. 

Lemma 3 For every positive Markov matrix $M$, and for every couple $i, j \in \{1, \dots, H\}$, we have 
that 

\[ \lim_{x \to \infty} [M(x)]_{ij} = \lim_{\mu \to \infty} [M^{\mu}]_{ij} = \bar{M}_{ij} . \]

Proof: By definition of $\bar{M}$, for every $\epsilon > 0$ there is a number $k \in \mathbb{N}$, such that for every $\mu > k$, 
we have $\left| [M^{\mu}]_{ij} - [\bar{M}]_{ij} \right| < \epsilon$. By driving $x \to \infty$ we can make arbitrarily small the weight 
$\frac{x^{\nu}}{(e^{x}-1)\,\nu!}$ of every $\nu \le k$. In this way $[M(x)]_{ij}$ becomes a weighted average of almost only 
elements from $\{M^{\mu}\}_{\mu \in \mathbb{N}}$ with $\mu > k$. As for all of them we have $\left| [M^{\mu}]_{ij} - [\bar{M}]_{ij} \right| < \epsilon$, 
we have the result. ∎ 

Definition 4 $M$ satisfies the monotone convergence property if, for every couple $i, j \in \{1, \dots, H\}$, 
and for every $\mu \in \mathbb{N}$, the element $[M^{\mu}]_{ij}$ has the following properties: 

1. if $M_{ij} > \bar{M}_{ij}$, then $M_{ij} \ge M^{\mu}_{ij} \ge M^{\mu+1}_{ij} \ge \bar{M}_{ij}$; 

2. if $M_{ij} < \bar{M}_{ij}$, then $M_{ij} \le M^{\mu}_{ij} \le M^{\mu+1}_{ij} \le \bar{M}_{ij}$. 

What comes out directly from the definition is that, if $M_{ij} > \bar{M}_{ij}$, then there is at least one $\mu$ 
for which the inequality is strict, i.e. $M^{\mu}_{ij} > M^{\mu+1}_{ij}$. 

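Lemma 4 For every couple $i, j \in \{1, \dots, H\}$, and for every $x > 0$, if $M$ satisfies the monotone 
convergence property, then 

1. if $M_{ij} > \bar{M}_{ij}$, then $\frac{d}{dx}[M(x)]_{ij} < 0$; 

2. if $M_{ij} < \bar{M}_{ij}$, then $\frac{d}{dx}[M(x)]_{ij} > 0$. 

Proof: We focus on case 1, as the other is proven by reversing inequalities. 
First, note that the function 

\[ \frac{\mu}{x}\left(e^{x} - 1\right) - e^{x} \]

is negative if and only if 

\[ \mu < \frac{x e^{x}}{e^{x} - 1} . \]

This function has the same sign as the derivative with respect to $x$ of the weight $\frac{x^{\mu}}{\mu!\,(e^{x}-1)}$, 
so the weights of low powers of $M$ fall with $x$ while those of high powers rise. 
Let us call $\nu(x)$ the minimum integer strictly above $\frac{x e^{x}}{e^{x}-1}$, i.e. $\nu(x) = \left\lfloor \frac{x e^{x}}{e^{x}-1} \right\rfloor + 1$. 
Now we can show that 

\[ \frac{d}{dx}[M(x)]_{ij} = \sum_{\mu=1}^{\infty} \frac{d}{dx}\!\left( \frac{x^{\mu}}{\mu!\,(e^{x}-1)} \right) [M^{\mu}]_{ij} < [M^{\nu(x)}]_{ij} \sum_{\mu=1}^{\infty} \frac{d}{dx}\!\left( \frac{x^{\mu}}{\mu!\,(e^{x}-1)} \right) . \tag{b} \]

It is a matter of calculus to check that 

\[ \sum_{\mu=1}^{\infty} \frac{d}{dx}\!\left( \frac{x^{\mu}}{\mu!\,(e^{x}-1)} \right) = 0 , \]

and then the derivative in (b) is strictly negative. ∎ 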



Finally, we provide a simple sufficient condition for a 2 x 2 Markov matrix. 

Lemma 5 Consider the 2 x 2 Markov matrix $M = \begin{pmatrix} p & 1-p \\ 1-q & q \end{pmatrix}$. If $p > 1 - q$, then $M$ satisfies the 
monotone convergence property. 

Proof: By the Perron-Frobenius Theorem this matrix converges to $\begin{pmatrix} u & 1-u \\ u & 1-u \end{pmatrix}$, with 

\[ u = \frac{1-q}{(1-p) + (1-q)} . \]

It is easy to check that, as $p > 1 - q$, then $p > u$. By symmetry between $p$ and $q$, also $q > 1 - u$. 

Consider a matrix $\begin{pmatrix} \alpha & 1-\alpha \\ 1-\beta & \beta \end{pmatrix}$, such that $p \ge \alpha \ge u$ and $q \ge \beta \ge 1 - u$. To finish the 
proof it is enough to show that 

\[ p \ge \alpha p + (1-\alpha)(1-q) \ge u , \tag{c} \]

as it will be proved symmetrically also with respect to $q$ and $\beta$. The middle term of these inequalities 
is increasing in $\alpha$, as $p > 1 - q$. If $\alpha = u$ it is equal to $u$; if instead $\alpha = p$, with some algebraic 
substitution, we have that again both inequalities are satisfied, as $p > 1 - q$. This completes the 
proof. ∎ 
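
A small numerical check of the 2 x 2 case may help fix ideas. The sketch below assumes illustrative values `p = 0.8`, `q = 0.6` (so that $p > 1 - q$) and verifies two things: that the entries of successive powers of the matrix approach the limit from one side without overshooting, and that the entries of the averaged matrix $M(x)$ of Lemma 2 move monotonically in $x$, as Lemma 4 states. The helper names are hypothetical and the series for $M(x)$ is truncated at the number of stored powers.

```python
import math

p, q = 0.8, 0.6                        # hypothetical values with p > 1 - q
M = [[p, 1 - p], [1 - q, q]]
u = (1 - q) / ((1 - p) + (1 - q))      # limit weight from the proof of Lemma 5
M_bar = [[u, 1 - u], [u, 1 - u]]


def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def approaches_monotonically(values, target, tol=1e-12):
    """True if the gaps to target keep one sign and never grow in size."""
    gaps = [v - target for v in values]
    same_sign = all(g >= -tol for g in gaps) or all(g <= tol for g in gaps)
    shrinking = all(abs(b) <= abs(a) + tol for a, b in zip(gaps, gaps[1:]))
    return same_sign and shrinking


# Store M^1, ..., M^40.
powers = [M]
for _ in range(39):
    powers.append(matmul(powers[-1], M))


def averaged_entry(i, j, x):
    """[M(x)]_ij from the series truncated at the stored powers of M."""
    weight, total = 1.0, 0.0
    for mu, P in enumerate(powers, start=1):
        weight *= x / mu               # x^mu / mu!
        total += weight * P[i][j]
    return total / math.expm1(x)


xs = [0.5, 1.0, 2.0, 4.0]
cells = [(i, j) for i in range(2) for j in range(2)]
powers_ok = all(
    approaches_monotonically([P[i][j] for P in powers], M_bar[i][j])
    for i, j in cells)
averaged_ok = all(
    approaches_monotonically([averaged_entry(i, j, x) for x in xs], M_bar[i][j])
    for i, j in cells)
print(powers_ok, averaged_ok)
```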

Appendix B Proofs 
B.l Proofs for Section [Ml 

Proof of Proposition 1: Note first that the node born at time $t'$ in the definition of weak integration has, 
at the beginning of time $t' + 1$ (before node $t' + 1$ sends its links), an in-degree of 0. This directly 
implies that the probability of $t'$ to receive a link at time $t' + 1$ from a node of type $\theta''$, given that 
such a node is born, is equal to the probability of being found at random among the $t'$ nodes in 
the network. This probability is equal to: 

^P(O) • (d) 

On the other hand, the probability that node $t_0$ is found at time $t' + 1$ is the sum of the 
probability of being found at random and through search. In the model with homogeneous search, 
this is: 

nrrir (an an w , la" n\ n *o ^' ^M) 1 / \ 

— ,e(to)) + nm s ^ P (6 ,6) - . (e) 

Note that in (e) the terms in the vector $\Pi_{t_0}^{t'}(\theta, \theta(t_0))$ grow without bound as $t'$ tends to infinity, 
while the first terms in (e) and in (d) are constant once $t'$ is eliminated from the denominator of 
both expressions. It follows that we can always choose a $t'$ large enough for (e) to be larger than 
(d). ∎ 

Proof of Proposition 2: We want to see how the matrix $\Pi_{t_0}^{t}$ of type-by-type links 
for a node born at time $t_0$ develops. To do this we compare its behavior with the behavior of the 
type-blind process, where the in-links evolve according to^21 

\[ \pi_{t_0}(t) = n \frac{m_r}{m_s} \left[ \left( \frac{t}{t_0} \right)^{m_s} - 1 \right] . \]

To make this comparison in the long run we study 

\[ \lim_{t \to \infty} \frac{\Pi_{t_0}^{t}}{\pi_{t_0}(t)} . \tag{f} \]

Consider the solution of the model, as described by equation (5), with the decomposition $B_r = Q A Q^{-1}$. 

We rewrite (5) as: 

\[ \Pi_{t_0}^{t} = n \frac{m_r}{m_s} \sum_{\mu=1}^{\infty} \frac{\left( m_s \log \frac{t}{t_0} \right)^{\mu}}{\mu!} B_r^{\mu} . \]

We now use some results from Section 3.1. By equation (2), and the facts that $I = Q I Q^{-1}$ and 
$B_r^{\mu} = Q A^{\mu} Q^{-1}$, we obtain: 

\[ \frac{\Pi_{t_0}^{t}}{\pi_{t_0}(t)} = Q \left[ \left( \left( \frac{t}{t_0} \right)^{m_s} - 1 \right)^{-1} \sum_{\mu=1}^{\infty} \frac{\left( m_s \log \frac{t}{t_0} \right)^{\mu}}{\mu!} A^{\mu} \right] Q^{-1} . \tag{g} \]

Limit (f) implies that (we use Lemma 3 from Appendix A) 

\[ \lim_{t \to \infty} \frac{\Pi_{t_0}^{t}}{\pi_{t_0}(t)} = Q \left( \lim_{\mu \to \infty} A^{\mu} \right) Q^{-1} = Q \begin{pmatrix} v(A)' \\ \vdots \\ v(A)' \end{pmatrix} Q^{-1} , \tag{h} \]

where the row-vector $v(A)'$ is the unique eigenvector associated with eigenvalue 1 of matrix $A$ 
(normalized to sum to 1). In this way, in the long run a node of type $i$ born at time $t_0$ receives a 
fraction of in-links from nodes of type $j$ which is given by the ratio 

\[ p(j) \, \frac{v(A)_i}{p(i)} \]

of the overall links that it would receive in a type-blind process. This proportion is the product 
of $p(j)$ times a term that is constant for type $i$. ∎ 

21 This process reduces to the 1-type case studied in Jackson and Rogers (2007). 



Proof of Proposition 3: The result comes from the expression of matrix $\Pi_{t_0}^{t}$ as 
defined in equation (g), in the Proof of Proposition 2: 

\[ \frac{\Pi_{t_0}^{t}}{\pi_{t_0}(t)} = Q \left[ \left( \left( \frac{t}{t_0} \right)^{m_s} - 1 \right)^{-1} \sum_{\mu=1}^{\infty} \frac{\left( m_s \log \frac{t}{t_0} \right)^{\mu}}{\mu!} A^{\mu} \right] Q^{-1} . \]

Here $\left( \left( \frac{t}{t_0} \right)^{m_s} - 1 \right)^{-1}$ is just a rescaling term so that the matrix in brackets is again a Markov 
matrix (see Lemma 2 in Appendix A). From the Proof of Proposition 2 we know that it converges 
to the distribution of the population shares. As $A$ satisfies the monotone convergence property, we 
can apply Lemma 4 from Appendix A to prove that this convergence is monotonic. ∎ 



Proof of Proposition 4: 

Expressing the steady state equation (9) we obtain 

\[ D = (1 - m_s) \left( I - m_s B_r \right)^{-1} B_r . \tag{i} \]

Using the algebraic identity 

\[ \left( I - m_s B_r \right)^{-1} = \sum_{\mu=0}^{\infty} \left( m_s B_r \right)^{\mu} , \]



we obtain the following expression: 



D = B 



1 - m. 



m. 



In the above expression, the matrix in brackets is such that, as $m_s \to 1$, the elements of each 
column homogenize (see Lemma 3 of Appendix A). However, full homogeneity only occurs at the 
limit $m_s \to 1$. 

To obtain some insight on the time evolution of the out-degree, let us express equation (8) as 
a differential equation, and solve it explicitly (as we have done in (5) for the in-degree). 
The system is 



d A t 

—A t = (l-m s )B r + m s B r — , 



0) 



with solution: 



Dt + Ct 



m s B r 
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where C is a constant matrix. 

For a given initial condition Di (that we can identify with the matrix A of biases) the solution 
for Dt can be written as: 



I 

B.2 Proofs for Section [H 

For the proofs of this section we need an additional preliminary result, that follows here below 

Consider a degree distribution $F(k)$ obtained implicitly through a process such that the growth 
of a node born at $t_0$ is governed by $k_{t_0}(t) = f(t/t_0)$ and another degree distribution $G$ such that 
$k_{t_0}(t) = g(t/t_0)$, with $k_{t_0}(t_0) = 0$. Assume $f$ and $g$ are weakly increasing and continuous on $[1, +\infty[$ 
and that $\lim_{x \to \infty} f(x) = \lim_{x \to \infty} g(x) = \infty$. 

Lemma 6 $F$ First-Order Stochastically Dominates $G$ if and only if for all $x > 1$, $f(x) \ge g(x)$, with 
strict inequality for some $x$. 

Proof of Lemma 6: Assume $f(x) \ge g(x)$ for all $x > 1$. Pick $k$ and $t$ arbitrarily. Define $i_f$ 
as the birthdate of the node with degree $k$ at time $t$ under $f$, and similarly for $i_g$. We have 
$f(t/i_f) = k \ge g(t/i_f)$, which, since $g$ is non-decreasing, implies that $i_g \le i_f$. Since $F_t(k) = 1 - i_f/t$ 
and $G_t(k) = 1 - i_g/t$, we have $F_t(k) \le G_t(k)$. 

Now take $x$ such that $f(x) > g(x)$. Pick $t$ arbitrarily. Define $i_f = t/x$ and $k$ to be the 
size of node $i_f$ at time $t$ under $f$. Then set $i_g$ to be the node with degree $k$ at time $t$ under $g$. We 
have $k = f(t/i_f) = f(x) > g(x) = g(t/i_f)$, which implies that $i_g < i_f$. Thus $F_t(k) < G_t(k)$. 

To show necessity, fix $t$ and choose $i_f$ so that $f(t/i_f) < g(t/i_f)$, and set $k = f(t/i_f)$. Defining $i_g$ 
as the node with degree $k$ at time $t$ under $g$, we know that $i_g > i_f$. This implies that $G_t(k) < F_t(k)$, 
completing the proof. ∎ 
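
The logic of Lemma 6 can be illustrated with a discrete population of nodes born at dates 1 to t. The sketch below is an assumption-laden toy: the two growth curves `f` and `g` are made-up examples with $f \ge g$ on $[1, \infty)$, and the distributions are computed by direct counting rather than through the closed form $1 - i/t$ used in the proof.

```python
def degree_cdf(growth, t, k):
    """Share of nodes born at dates 1..t whose degree growth(t / birth) is at most k."""
    return sum(1 for birth in range(1, t + 1) if growth(t / birth) <= k) / t


def f(x):
    return x ** 0.8 - 1.0      # hypothetical growth curve


def g(x):
    return x ** 0.5 - 1.0      # hypothetical curve, below f for every x > 1


t = 2000
grid = [0.1 * step for step in range(300)]
F = [degree_cdf(f, t, k) for k in grid]
G = [degree_cdf(g, t, k) for k in grid]

# First-order stochastic dominance of F over G: F lies weakly below G everywhere,
# and strictly below somewhere.
f_dominates_g = (all(a <= b for a, b in zip(F, G))
                 and any(a < b for a, b in zip(F, G)))
print(f_dominates_g)
```

Because every node grows at least as fast under `f` as under `g`, any degree threshold is exceeded by at least as many nodes under `f`, which is exactly the pointwise comparison of the two distribution functions.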

Now we can proceed with the proofs of Section HI 

Proof of Lemma 1: Apply to this particular case the expression from (5), considering 
the decomposition from (2): 



S *=» + 7 <Di-D)f 



.m s B r 



00 




\ 



where we have called $p_1 = p(\theta_1, \theta_1)$ and $p_2 = p(\theta_2, \theta_2)$. 
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As can be directly computed checking [22], we have that 



Pi 1 - Pi 

^\ \1-P2 P2 J 

to 



I (l-pi)t ms <Pl+P2- 1 )+(l-p 2 )t" 

1-px-Pi 



\ 



(l-p 2 ) [tf -t ms (P1+P2-1) ) 
2— pi-pa 



where we have reported only the first column. 



Recall that pi 



FT 



(l-p)(l-7)+P7 "r p(l-7)+(l-pb 

manipulations show that - 



P(l~7) 2 



and P2 



(1-P)7 2 
p(l- 7 )+(l-p) 



+ 



1 

J 

(l-p)(l~ 7 ) 2 
(l-p)(l-7)+P7 ' 



Some 



that Q 



p 
1-p 



. _ - 1 — p, and symmetrically 2 -p^-p 2 = ^ we finally consider 
we have the result. | 



Proof of Proposition 5: We take the case of $\theta_1$. The case of $\theta_2$ is analogous. Define 
$f(x) = n \frac{m_r}{m_s} (x^{m_s} - 1)$; hence $f^{-1}(k) = \left( 1 + \frac{m_s}{n m_r} k \right)^{1/m_s}$. Next define 
$g(x) = p + (1-p)(x^{b m_s} - 1)/(x^{m_s} - 1)$. Notice that at time $t$, the proportion of in-links that a node of type $\theta_1$ born at time 
$t_0 < t$ has from its own group is 

\[ \frac{\Pi_{t_0}^{t}(1,1)}{\Pi_{t_0}^{t}(1,1) + \Pi_{t_0}^{t}(2,1)} . \]

It then follows from Lemma 1 that $r_1(k) = g(f^{-1}(k))$. Evaluating this formula delivers the claimed 
expression. 

Without loss of generality, we can set $n m_r / m_s = 1$ in what follows. Introduce $y = 1 + k$. Next, 
$r'' = [(f^{-1})']^2 \, g'' \circ f^{-1} + (f^{-1})'' \, g' \circ f^{-1}$. Developing and substituting shows that $r''$ has the same 
sign as 

\[ \varphi(y) = y^{b+2}(1-b)(2-b) + y^{b+1} \, 2b(2-b) - y^{b} \, b(1-b) - 2y^{2} . \]

A detailed study of $\varphi$ and its first three derivatives then shows that $\varphi(y) > 0$ if $y > 1$, hence that 
$r''(k) > 0$ if $k > 0$. 
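
Under the normalization $n m_r / m_s = 1$ used above, the own-group share reduces to $r(k) = p + (1-p)\,((1+k)^{b} - 1)/k$. The following sketch evaluates this expression and the function $\varphi$ at a few points; the parameter values `p = 0.6` and `b = 0.5` are arbitrary illustrations, not values taken from the paper.

```python
def own_group_share(k, p, b):
    """r(k) = p + (1 - p) * ((1 + k)**b - 1) / k, with n * m_r / m_s normalised to 1."""
    if k == 0:
        return p + (1 - p) * b          # limit of the expression as k -> 0
    return p + (1 - p) * ((1 + k) ** b - 1) / k


def phi(y, b):
    """Function whose sign matches the sign of r'' at k = y - 1."""
    return (y ** (b + 2) * (1 - b) * (2 - b)
            + y ** (b + 1) * 2 * b * (2 - b)
            - y ** b * b * (1 - b)
            - 2 * y ** 2)


p, b = 0.6, 0.5                          # hypothetical parameter values
ks = [0, 1, 2, 5, 10, 50, 100]
shares = [own_group_share(k, p, b) for k in ks]
phis = [phi(y, b) for y in (1.5, 2.0, 5.0, 10.0)]

print([round(s, 4) for s in shares])
print([round(v, 4) for v in phis])
```

In this example the share of same-group in-links falls from $p + (1-p)b$ at $k = 0$ towards $p$ as the in-degree grows, and $\varphi$ is positive at every evaluated $y > 1$, in line with the convexity claim.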

The explicit expressions for the derivative with respect to p are not trivial given that 6 is a 
function of p. We have 



Or 
dk 



-(1-p) 



(l + A;) b - 1 (l + (l-b)A:)-l 
P 



k\l + k) 1 -'^ = ip(k) = 1 + (1 - b)k - (1 + k) 1 -' + (1 - p)(-^) [hi(l + k)(l + (1 - b)k) - k] 
Also, note that r(0) = p + (1 - p)6. Thus, g(0) = 1 - b + J|(l - p). We have: 6 = 1- 



—p. — ■ U ^ 1 -,X > , — r and ^ = _ ( 2 p lHgl_JJ ^ lL i Developing, we get that l£(0) has the same 

p(l-p)(27-l) 2 +7(l-7) dp [p(i_ p )(2 7 -l)2 +7 (i_ 7 )]- ! K to> fa dpW 



! I „ 1 P1 — , „ 1 P2 — ] is actually the eigenvector of I 



Pi 1 - Pi 

-P2 P2 



associated to its asymptotic limit 



Pi 1 - Pi 



limt^oo I 1 ] (see the proof of Lemma \E\ in I Appendix A|). Considering the location-based model, 

V 1 ~ P2 P2 J " 

it is reasonable that this limit does not depend on 7 but only on the initial distribution of the two types, given by p. 
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sign as 1 - (2 7 - l) 2 p(2 -p) - 37(1 - 7) > 1 - (2 7 - l) 2 - 37(1 - 7) = 7(1 - 7) > where the first 
inequality comes from the fact that p(2 — p) < 1. Thus, §^(0) > and 1 — 6 > (— — p)- 
Next, derive the function ?/> with respect to k. We have: 



(l + fc)- b ) + (l 
dp' 



P)(" 



1 

<9p ■ 



6)^(1 + ^-6(1-1 



and 



(1 " -&)-(!- P)(-£)K1 + k)- 1 + 6(1 - 6)(1 + *;)" 



^'(fc) = (l-6)(l 

(1 + kW(k) 

Here, V"(0) = 6(1 - b) + (1 - 26). Since 1 - 6 > -p), we have ^"(0) > 

(1 — ||)(1 — 6) > 0. Also, limfc^oo tp"(k) = + . By looking at its derivative, we see that the 
function (1 + k)tp"(k) is either decreasing, or increasing and decreasing. In either case, since it is 
positive when k = and when k — > 00, it must be greater than or equal to zero for any k. Thus, 
ip" > if k > hence ip' is increasing. Since tp'(0) = 0, ip' > if k > 0. Thus, ip is increasing and 
as V>(0) = 0, V > and > if k > . Finally, given that §^(0) > and |^ is increasing in k, 
% > 0, V*. I 

Proof of Proposition 6: For (i), use Lemma 1 to write 

\[ \Pi_{t_0}^{t}(1,1) = n \frac{m_r}{m_s} \, p \left[ \left( \frac{t}{t_0} \right)^{m_s} - \left( \frac{t}{t_0} \right)^{b m_s} \right] + n \frac{m_r}{m_s} \left[ \left( \frac{t}{t_0} \right)^{b m_s} - 1 \right] , \]

\[ \Pi_{t_0}^{t}(2,1) = n \frac{m_r}{m_s} \, (1-p) \left[ \left( \frac{t}{t_0} \right)^{m_s} - \left( \frac{t}{t_0} \right)^{b m_s} \right] . \]

Given that $p > 1/2$ and that $b > 0$, we know that $\frac{1-p}{p} < 1$ and the second term in the first equation 
is non-negative. Thus $\Pi_{t_0}^{t}(1,1) > \Pi_{t_0}^{t}(2,1)$ for all $t > t_0$, which allows us to apply Lemma 6. 

Now consider the expressions for $\Pi_{t_0}^{t}(2,2)$ and $\Pi_{t_0}^{t}(1,2)$ obtained from the above equations by 
exchanging $p$ with $1-p$. When $p > 1/2$ (meaning $\theta_1$ is the majority group) then $\frac{p}{1-p} > 1$, and when 
$b < 1$ (i.e., $\gamma < 1$, meaning there is at least some inter-group linking) then for large values of $t/t_0$ the 
second term in the expression for $\Pi_{t_0}^{t}(2,2)$ becomes negligible, in which case $\Pi_{t_0}^{t}(2,2) < \Pi_{t_0}^{t}(1,2)$ 
in the upper tail, proving (ii) by application of Lemma 6. 

For (iii), introduce the function $\psi(x) = p(1-p)[x^{m_s} - x^{b m_s}] - (1-p)[(1-p)x^{m_s} + p x^{b m_s} - 1]$. 
Note that $\psi(t/t_0) > 0$ if and only if $\Pi_{t_0}^{t}(1,2) > \Pi_{t_0}^{t}(2,2)$. Observe that $\psi(1) = 0$. Also, 

\[ \frac{\psi'(x)}{x^{m_s - 1}} = (2p-1)(1-p) m_s - 2p(1-p) \, b \, m_s \, x^{(b-1) m_s} . \]

Since $b - 1 < 0$, the second term of the RHS is weakly increasing in $x$. There are two cases. First, 
$\psi'(1) \ge 0$, in which case $\forall x \ge 1, \psi'(x) \ge 0$, thus $\psi$ is weakly increasing and $\forall x \ge 1, \psi(x) \ge 0$. 
Otherwise $\psi'(1) < 0$, in which case $\psi'$ is first negative then positive above 1 (since $\psi'(\infty) = \infty$), 
hence $\psi$ is first decreasing and then increasing, which also means that $\psi$ is first negative and then 
positive above 1. Therefore, $F_{21}$ FOSD $F_{22}$ if and only if $\psi'(1) \ge 0$. The condition reduces to 

\[ b \le \frac{2p - 1}{2p} . \]

For (iv), working under the original model, previous equations reduce to 
$\Pi_{t_0}^{t}(1,1) = n \frac{m_r}{m_s} \left[ p \left( \frac{t}{t_0} \right)^{m_s} + (1-p) \left( \frac{t}{t_0} \right)^{b m_s} - 1 \right]$, 
$\Pi_{t_0}^{t}(2,1) = n \frac{m_r}{m_s} (1-p) \left[ \left( \frac{t}{t_0} \right)^{m_s} - \left( \frac{t}{t_0} \right)^{b m_s} \right]$ and 
$\Pi_{t_0}^{t}(1,1) + \Pi_{t_0}^{t}(2,1) = n \frac{m_r}{m_s} \left[ \left( \frac{t}{t_0} \right)^{m_s} - 1 \right]$, which does not depend on $p$. This proves the result. ∎ 
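
The threshold in part (iii) can be probed numerically. The sketch below evaluates $\psi$ on a grid of $x > 1$ for one value of $b$ below and one above $(2p-1)/(2p)$; the parameter values (`p = 0.8`, `m_s = 0.5`, and the two values of `b`) are illustrative assumptions rather than estimates from the paper.

```python
def psi(x, p, b, m_s):
    """psi(x) = p(1-p)[x^m_s - x^(b m_s)] - (1-p)[(1-p) x^m_s + p x^(b m_s) - 1]."""
    return (p * (1 - p) * (x ** m_s - x ** (b * m_s))
            - (1 - p) * ((1 - p) * x ** m_s + p * x ** (b * m_s) - 1))


def threshold(p):
    """Upper bound on b obtained in the proof: (2p - 1) / (2p)."""
    return (2 * p - 1) / (2 * p)


p, m_s = 0.8, 0.5                       # hypothetical values, with p > 1/2
xs = [1 + 0.05 * i for i in range(1, 400)]
b_low, b_high = 0.2, 0.6                # one value on each side of threshold(p)

nonneg_below = all(psi(x, p, b_low, m_s) >= 0 for x in xs)
negative_somewhere_above = any(psi(x, p, b_high, m_s) < 0 for x in xs)
print(threshold(p), nonneg_below, negative_somewhere_above)
```

With `b` below the bound, $\psi$ stays non-negative on the whole grid; with `b` above it, $\psi$ dips below zero just above $x = 1$, which is the failure of dominance described in the proof.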



Proof of Proposition 7: Observe that $b$ increases with $\gamma$. This means that $(\Pi_{t_0}^{t})'(1,1) > \Pi_{t_0}^{t}(1,1)$ 
and $(\Pi_{t_0}^{t})'(2,2) > \Pi_{t_0}^{t}(2,2)$ while $(\Pi_{t_0}^{t})'(2,1) < \Pi_{t_0}^{t}(2,1)$ and $(\Pi_{t_0}^{t})'(1,2) < \Pi_{t_0}^{t}(1,2)$. The 
result then follows from Lemma 6. ∎ 

Proof of Proposition 8: Substituting from equation (10), we have 

\[ IH_1(p) = \frac{(1-2\gamma)^2 \, \sigma \, p(1-p)}{\sigma p(1-p) + \left( 1 + (1-2p)^2 \sigma \right) \gamma(1-\gamma)} . \]

From this expression, it is easily verified that $IH_1(p) = IH_1(1-p)$ and that $IH_1(0) = IH_1(1) = 0$. 
The first derivative of $IH_1$ is 

\[ \frac{d IH_1(p)}{dp} = \frac{(1-2p) \, \sigma(\sigma+1) \, (2\gamma-1)^2 \, \gamma(1-\gamma)}{\left( (\sigma+1)\gamma(1-\gamma) + (2\gamma-1)^2 \sigma p(1-p) \right)^2} , \]

which has the same sign as $1 - 2p$, proving that $IH_1$ is increasing below $p = 1/2$ and then decreasing. 
To show concavity, write the second derivative as 

\[ \frac{d^2 IH_1(p)}{dp^2} = \frac{2\gamma(1-\gamma)(2\gamma-1)^2 \sigma(\sigma+1) * \left( \sigma(3p^2 - 3p + 1) - \gamma(1-\gamma)\left( 3\sigma(2p-1)^2 - 1 \right) \right)}{-\sigma^3 \left( \gamma(1-\gamma)\left( (2p-1)^2 + 1/\sigma \right) + p(1-p) \right)^3} . \]

The denominator is negative, and the term in the numerator before the asterisk is positive, so $IH_1$ 
is concave if and only if $\sigma(3p^2 - 3p + 1) - \gamma(1-\gamma)(3\sigma(2p-1)^2 - 1) > 0$. Dividing by $\sigma$ and 
rearranging, we must show that $\gamma(1-\gamma)(3(2p-1)^2 - 1/\sigma) + 3p(1-p) < 1$. We have $\gamma(1-\gamma) \le 1/4$ and 
$-1/\sigma < 0$; using these inequalities and collecting terms proves the result. ∎ 
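
The shape properties established in this proof (symmetry around $p = 1/2$, zero at the endpoints, a single peak, concavity) can be checked on a grid. The sketch below assumes illustrative values `gamma = 0.8` and `sigma = 1.5`; the function name `ih1` is a label chosen here, not notation from the paper.

```python
def ih1(p, gamma, sigma):
    """IH_1(p) as written in the proof of Proposition 8."""
    num = (1 - 2 * gamma) ** 2 * sigma * p * (1 - p)
    den = (sigma * p * (1 - p)
           + (1 + (1 - 2 * p) ** 2 * sigma) * gamma * (1 - gamma))
    return num / den


gamma, sigma = 0.8, 1.5                 # hypothetical parameter values
grid = [i / 100 for i in range(101)]
values = [ih1(p, gamma, sigma) for p in grid]

# Discrete second differences; all should be non-positive for a concave function.
second_diffs = [values[i - 1] - 2 * values[i] + values[i + 1]
                for i in range(1, len(values) - 1)]
peak_index = max(range(len(values)), key=values.__getitem__)
print(grid[peak_index], round(values[peak_index], 4))
```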

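Proof of Corollary 1: The relevant derivatives are 

\[ \frac{\partial IH_1}{\partial \sigma} = \frac{(1-2\gamma)^2 \, \gamma(1-\gamma) \, p(1-p)}{\left( \sigma p(1-p) + \gamma(1-\gamma)\left( 1 + (1-2p)^2 \sigma \right) \right)^2} , \]

\[ \frac{\partial IH_1}{\partial \gamma} = \frac{(2\gamma-1) \, p(1-p) \, \sigma(1+\sigma)}{\left( \sigma p(1-p) + \gamma(1-\gamma)\left( 1 + (1-2p)^2 \sigma \right) \right)^2} , \]

both of which are easily verified as being positive. ∎ 
B.3 Proofs for Section 5 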

Proof of Proposition 9: For what concerns the weak integration property, see the 
Proof of Proposition 1. The only thing to change is to use the right-hand part of equation (14) instead 
of the formula in (e). 

For long-run integration, consider the solution to the general model, with biased search, as 
described by equation (15). We follow the same procedure as in the proof of Proposition 2, since 
$B_s \odot A$ is still a Markov matrix. We obtain 



lim — — 



(B s 0B r ) _1 B r Q 



7T to (t) 



( v{B s A)' \ 



\ v(B a A)' J 



Q 1 • 



(i) 



In the long run a node of type i born at time io will receive a number of in-links from nodes of 
type j which is a fraction 



limA- 

t^oo TTt (t) 



H , 

^([(B 8 0B r ) -1 B T 



h=l 
H / H 



. u P(h) tt: — 



h=i \k=i ^ 3 

H / H 

E(E([(B,0B r )-'] 3 . t p(*) 



p(k,h)\ v(B s QA)i 
p(h)- 



K h=l k=l 
H 



jk 



p(k)p(k, h) 



p(i) 
v(B s A), 



p(i) 



vfe=l 



B s B r 



(m) 



of the overall links that it would receive in a type-blind process, where the last line comes from 
the fact that $\sum_{k=1}^{H} p(k, h) = 1$. The second term is still a constant for type $i$, but the first term is 
generically not proportional to $p(j)$. 

Finally, for the partial integration property, the proof is analogous to the Proof of Proposition 
3. In this case 

. ™ i fvi / 1 / £ \ 

t 



Ul ° — (B 8 Bt-) -1 B r Q f ffrl m '-l)" 1 E ^' l0g r // (B s 0Af 1 Q 1 . 



As $B_s \odot A$ satisfies the monotone convergence property, we can use Lemma 4 from Appendix A 
and the same reasoning applies. ∎ 
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